{"id":29280176,"url":"https://github.com/roderickqiu/data-mining-project","last_synced_at":"2025-07-05T14:39:28.574Z","repository":{"id":301497948,"uuid":"984093998","full_name":"RoderickQiu/data-mining-project","owner":"RoderickQiu","description":"Data Mining (CS306) Project, Spring 2025, SUSTech CSE.","archived":false,"fork":false,"pushed_at":"2025-06-27T06:19:39.000Z","size":2873,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-27T07:27:17.492Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/RoderickQiu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-15T11:41:33.000Z","updated_at":"2025-06-27T06:19:43.000Z","dependencies_parsed_at":"2025-06-27T07:37:40.029Z","dependency_job_id":null,"html_url":"https://github.com/RoderickQiu/data-mining-project","commit_stats":null,"previous_names":["roderickqiu/data-mining-project"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/RoderickQiu/data-mining-project","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RoderickQiu%2Fdata-mining-project","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RoderickQiu%2Fdata-mining-project/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RoderickQiu%2Fdata-mining-project/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Roder
ickQiu%2Fdata-mining-project/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/RoderickQiu","download_url":"https://codeload.github.com/RoderickQiu/data-mining-project/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RoderickQiu%2Fdata-mining-project/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263756756,"owners_count":23506593,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-07-05T14:39:25.858Z","updated_at":"2025-07-05T14:39:28.560Z","avatar_url":"https://github.com/RoderickQiu.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Intelligent Learning Recommendation System\n\nThis repository contains the codebase for a CS306 (Data Mining) final project, which implements an intelligent learning recommendation system. 
The system features a deep learning-based knowledge tracing model and a personalized question recommendation engine, with a modern web frontend for user interaction.\n\n## Project Structure\n\n```\ndata-mining-project/\n├── recommend/              # Standalone recommendation logic (legacy, see main.py for API)\n├── frontend/               # Modern React + TypeScript web interface\n├── logs/                   # Training and experiment logs\n├── saved_models/           # Trained model checkpoints (not tracked by git)\n├── main.py                 # FastAPI backend server (primary API)\n├── train.py                # Model training script\n├── test.py                 # Model evaluation script\n├── dataset.py              # Data loading utilities\n├── requirements.txt        # Python dependencies\n└── ...\n```\n\n---\n\n## Features\n\n### 1. Knowledge Tracing \u0026 Prediction\n\n- Implements a deep learning model (SAINT+) for sequential knowledge tracing.\n- Predicts the probability of a student answering the next question correctly, based on their historical responses, response times, and question categories.\n- Model is built with PyTorch for modularity and scalability.\n\n### 2. Personalized Question Recommendation\n\n- Question Recommendation: Suggests questions based on user mastery of knowledge tags, question difficulty, quality flags, tag overlap with user strengths/weaknesses, and benchmark tags.\n- Available via API endpoint `/recommend_advanced`.\n\n### 3. Modern Web Frontend\n\n- Built with React, TypeScript, Tailwind CSS, and shadcn/ui.\n- Provides interactive prediction and recommendation interfaces.\n- Real-time data visualization with Recharts.\n- Responsive design for desktop and mobile.\n\n---\n\n## Getting Started\n\n### Backend (FastAPI)\n\n#### Prerequisites\n\n- Python 3.8+\n- Install dependencies:\n  ```bash\n  pip install -r requirements.txt\n  ```\n\n#### Running the API Server\n\n1. 
Ensure the trained model checkpoint is available at `saved_models/best_model-v3.ckpt`.\n2. Adjust dataset paths in `main.py` or `backend/config.py` as needed.\n3. Start the server:\n   ```bash\n   uvicorn main:app --reload\n   ```\n4. The API will be available at `http://localhost:8000`.\n\n#### Key API Endpoints\n\n- `POST /predict`: Predicts the probability of correct answers for a sequence.\n- `POST /recommend_advanced`: Returns advanced personalized question recommendations.\n- `POST /question_stats_by_ids`: Retrieves metadata for a list of question IDs.\n\n### Model Training\n\n- To train the model from scratch:\n  ```bash\n  python train.py\n  ```\n- Training logs and metrics are saved in the `logs/` directory.\n\n### Frontend\n\n#### Setup \u0026 Run\n\n```bash\ncd frontend\nnpm install\nnpm start\n```\n\n- The app will be available at `http://localhost:3000`.\n- API base URL is configured in `src/services/api.ts` (default: `http://localhost:8000`).\n\n#### Features\n\n- **Learning Prediction**: Input question IDs, response times, and categories to visualize predicted probabilities.\n- **Question Recommendation**: Get personalized question suggestions using basic or advanced algorithms.\n- **Data Visualization**: Interactive charts for predictions and recommendations.\n\n#### Project Structure (Frontend)\n\n```\nfrontend/\n├── public/                 # Static assets\n├── src/\n│   ├── components/        # React components\n│   │   ├── ui/            # shadcn/ui base components\n│   │   ├── MainTab.tsx    # Main tab\n│   ├── services/          # API services\n│   │   └── api.ts         # API interface\n│   ├── lib/               # Utility functions\n│   │   └── utils.ts       # General utilities\n│   ├── App.tsx            # Main app component\n│   └── index.tsx          # App entry point\n├── package.json           # Project config\n└── tailwind.config.js     # Tailwind config\n```\n\n---\n\n## Datasets\n\n- The system uses the [Riiid! 
Answer Prediction](https://www.kaggle.com/competitions/riiid-test-answer-prediction/data) dataset.\n- Please download the dataset and adjust paths in `main.py` or `backend/config.py` as needed.\n\n---\n\n## Python Dependencies\n\n- torch\n- pytorch-lightning\n- pandas\n- numpy\n- scikit-learn\n- fastapi\n- uvicorn\n\nThese packages are declared in `requirements.txt`.\n\n---\n\n## Logs \u0026 Model Checkpoints\n\n- Training logs are stored in `logs/`.\n- Model checkpoints are saved in `saved_models/` (not tracked by git due to size).","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Froderickqiu%2Fdata-mining-project","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Froderickqiu%2Fdata-mining-project","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Froderickqiu%2Fdata-mining-project/lists"}