{"id":24634463,"url":"https://github.com/srujayreddy/selling-laptops","last_synced_at":"2026-04-12T00:47:35.444Z","repository":{"id":273053515,"uuid":"918573251","full_name":"SrujayReddy/Selling-Laptops","owner":"SrujayReddy","description":"Predicting whether users will click on a promotional email for laptops based on historical user data and browsing logs.","archived":false,"fork":false,"pushed_at":"2025-01-20T09:07:24.000Z","size":3234,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-20T07:47:18.191Z","etag":null,"topics":["customer-behavior-analysis","feature-engineering","logistic-regression","machine-learning","marketing-analytics","numpy","pandas","predictive-modeling","scikit-learn"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SrujayReddy.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-18T09:36:02.000Z","updated_at":"2025-01-20T09:07:26.000Z","dependencies_parsed_at":"2025-01-18T10:38:33.955Z","dependency_job_id":null,"html_url":"https://github.com/SrujayReddy/Selling-Laptops","commit_stats":null,"previous_names":["srujayreddy/selling-laptops"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/SrujayReddy/Selling-Laptops","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SrujayReddy%2FSelling-Laptops","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SrujayReddy%2FSelling-Laptops/tags","releases_url"
:"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SrujayReddy%2FSelling-Laptops/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SrujayReddy%2FSelling-Laptops/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SrujayReddy","download_url":"https://codeload.github.com/SrujayReddy/Selling-Laptops/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SrujayReddy%2FSelling-Laptops/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267997183,"owners_count":24178251,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-31T02:00:08.723Z","response_time":66,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["customer-behavior-analysis","feature-engineering","logistic-regression","machine-learning","marketing-analytics","numpy","pandas","predictive-modeling","scikit-learn"],"created_at":"2025-01-25T09:12:49.518Z","updated_at":"2026-04-12T00:47:35.409Z","avatar_url":"https://github.com/SrujayReddy.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# Selling Laptops: Smart Marketing\n\n## Table of Contents\n1. [Overview](#overview)\n   - [Key Features](#key-features)\n2. [Learning Objectives](#learning-objectives)\n3. 
[Setup and Installation](#setup-and-installation)\n   - [Prerequisites](#prerequisites)\n   - [Setup Instructions](#setup-instructions)\n4. [Project Components](#project-components)\n   - [Dataset Overview](#dataset-overview)\n   - [The `UserPredictor` Class](#the-userpredictor-class)\n   - [Performance Metrics](#performance-metrics)\n5. [Accomplishments](#accomplishments)\n6. [Hints and Suggestions](#hints-and-suggestions)\n7. [Future Enhancements](#future-enhancements)\n8. [Acknowledgments](#acknowledgments)\n9. [License](#license)\n\n---\n\n## Overview\n\nThis project focuses on using machine learning to predict whether users will click on a promotional email for laptops based on historical user data and browsing logs. The goal is to target marketing efforts effectively while minimizing unnecessary emails.\n\n### Key Features\n- **High Prediction Accuracy**: Achieved 75%+ accuracy in predicting email clicks.\n- **Efficient Data Processing**: Reduced data processing time by 30% through optimized feature engineering.\n- **Robust Classification**: Developed a reliable classifier using Python libraries like scikit-learn, pandas, and NumPy.\n- **Comprehensive Evaluation**: Used cross-validation and confusion matrices for better model interpretability and validation.\n\n---\n\n## Learning Objectives\n\nThis project demonstrates:\n- The integration of purchase histories and browsing logs to build predictive models.\n- Advanced feature engineering techniques to improve data processing and model performance.\n- Evaluation of machine learning models with metrics like accuracy, cross-validation scores, and confusion matrices.\n\n---\n\n## Setup and Installation\n\n### Prerequisites\n- Python 3.x installed\n- Required libraries: pandas, numpy, scikit-learn\n\n### Setup Instructions\n1. Clone the repository:\n   ```bash\n   git clone https://github.com/SrujayReddy/Selling-Laptops.git\n   cd Selling-Laptops\n   ```\n\n2.  
Install dependencies:\n    \n    ```bash\n    pip install pandas numpy scikit-learn\n    \n    ```\n    \n3.  Ensure datasets (`train_users.csv`, `train_logs.csv`, `train_y.csv`) are available in the `data/` directory.\n\n----------\n\n## Project Components\n\n### Dataset Overview\n\nThe training (`train`) and testing (`test1`, `test2`) splits each use three kinds of files:\n\n1.  **Users Dataset (`*_users.csv`)**: Contains demographic and account-related information.\n2.  **Logs Dataset (`*_logs.csv`)**: Records user browsing history, including pages visited and time spent.\n3.  **Target Dataset (`*_y.csv`)**: Indicates whether users clicked on a promotional email (1 for yes, 0 for no).\n\n### The `UserPredictor` Class\n\nThe classifier is implemented in `main.py` as the `UserPredictor` class with two key methods:\n\n1.  **`fit(train_users, train_logs, train_y)`**:\n    -   Combines user and log data into a unified feature set.\n    -   Trains a scikit-learn pipeline, leveraging `LogisticRegression` or other classifiers.\n2.  **`predict(test_users, test_logs)`**:\n    -   Predicts email click outcomes for the test dataset.\n    -   Returns predictions as a NumPy array of Booleans.\n\n### Performance Metrics\n\n-   **Accuracy**: Primary metric for evaluation.\n-   **Cross-Validation**: Used to assess model robustness via the mean and standard deviation of the per-fold scores.\n-   **Confusion Matrix**: Provides insights into false positives, false negatives, and overall prediction quality.\n\n----------\n\n## Accomplishments\n\n-   **Achieved 75%+ Accuracy**: Developed a robust classifier that consistently performs above the threshold for full credit.\n-   **Optimized Data Processing**: Engineered features that reduced data processing time by 30%.\n-   **Enhanced Interpretability**: Evaluated models using cross-validation and confusion matrices for better insights.\n\n----------\n\n## Hints and Suggestions\n\n1.  
**Start Simple**: Begin with features from the `*_users.csv` dataset for a one-to-one mapping with predictions.\n2.  **Feature Engineering**: Create log-based features (e.g., total time spent, unique pages visited) to enhance model performance.\n3.  **Cross-Validation**: Use `cross_val_score` to evaluate model stability across different data splits.\n4.  **Model Pipelines**:\n    -   Combine `StandardScaler` with `LogisticRegression` for efficient processing and classification.\n5.  **Handle Missing Data**: Address cases where users lack log entries by imputing or creating default values.\n\n----------\n\n## Future Enhancements\n\n-   Explore advanced models like Random Forests or Gradient Boosting for higher accuracy.\n-   Automate hyperparameter tuning with tools like GridSearchCV or Optuna.\n-   Visualize feature importance to better understand model decisions.\n\n----------\n\n## Acknowledgments\n\nThis project was developed as part of the **CS 320** course at the University of Wisconsin–Madison. Special thanks to the teaching staff for guidance and support.\n\n----------\n\n## License\n\nThis project was developed as part of the **CS 320** course. It is shared strictly for educational and learning purposes only.\n\n**Important Notes:**\n\n-   Redistribution or reuse of this code for academic submissions is prohibited and may violate academic integrity policies.\n-   The project is licensed under the [MIT License](https://opensource.org/licenses/MIT). Any usage outside academic purposes must include proper attribution.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsrujayreddy%2Fselling-laptops","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsrujayreddy%2Fselling-laptops","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsrujayreddy%2Fselling-laptops/lists"}