{"id":27915452,"url":"https://github.com/aabbtree77/uci-marketing-analysis-cart","last_synced_at":"2025-07-29T07:07:16.257Z","repository":{"id":288961820,"uuid":"969672593","full_name":"aabbtree77/uci-marketing-analysis-cart","owner":"aabbtree77","description":"UCI bank marketing data analysis with decision trees (CART).","archived":false,"fork":false,"pushed_at":"2025-04-21T06:31:53.000Z","size":79,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-10T17:09:48.643Z","etag":null,"topics":["cart","chatgpt","commerce","conversion-rate","data-analysis","decision-trees","deepseek","grok","kovnatsky","marketing-analytics","miniconda","scikit-learn-python","uci-machine-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aabbtree77.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-04-20T17:21:07.000Z","updated_at":"2025-04-21T06:31:57.000Z","dependencies_parsed_at":"2025-06-10T17:09:37.826Z","dependency_job_id":"8c0e0b26-9e71-4f8d-8437-af797f6807f9","html_url":"https://github.com/aabbtree77/uci-marketing-analysis-cart","commit_stats":null,"previous_names":["aabbtree77/uci-marketing-analysis-cart"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/aabbtree77/uci-marketing-analysis-cart","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aabbtree77%2Fuci-marketing-analysis-cart","tags_url":"https://repos.ecosyste.ms/api/v1/hosts
/GitHub/repositories/aabbtree77%2Fuci-marketing-analysis-cart/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aabbtree77%2Fuci-marketing-analysis-cart/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aabbtree77%2Fuci-marketing-analysis-cart/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aabbtree77","download_url":"https://codeload.github.com/aabbtree77/uci-marketing-analysis-cart/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aabbtree77%2Fuci-marketing-analysis-cart/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267644727,"owners_count":24120866,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-29T02:00:12.549Z","response_time":2574,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cart","chatgpt","commerce","conversion-rate","data-analysis","decision-trees","deepseek","grok","kovnatsky","marketing-analytics","miniconda","scikit-learn-python","uci-machine-learning"],"created_at":"2025-05-06T15:54:31.937Z","updated_at":"2025-07-29T07:07:16.233Z","avatar_url":"https://github.com/aabbtree77.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ctable align=\"center\"\u003e\n    \u003ctr\u003e\n    \u003cth align=\"center\"\u003e Toretsk (Ukraine, April 
2025)\u003c/th\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n    \u003ctd\u003e\n    \u003cimg src=\"./images/Toretsk-2025.jpg\"  alt=\"What remains of Toretsk, Ukraine, April 2025\" width=\"100%\" \u003e\n    \u003c/td\u003e\n    \u003c/tr\u003e\n\u003c/table\u003e\n\n[...](https://www.reddit.com/r/UkraineRussiaReport/comments/1k2d20a/ua_pov_birdseye_view_of_what_remains_of_toretsk/)\n\n## The UCI Bank Marketing Campaign Decision Tree Analysis\n\nThis project analyzes [the UCI Bank Marketing Dataset](https://archive.ics.uci.edu/dataset/222/bank+marketing) using CART to predict customer subscription (a binary variable). A conversion rate is the average of the subscription value for a chosen data subset (market segment).\n\n## Python3 and miniconda (Ubuntu 22.04)\n\n```bash\nwget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh\nbash Miniconda3-latest-Linux-x86_64.sh\n```\n\nThe last step of miniconda (155MB) install: \"Do you wish to update your shell profile to automatically initialize conda?\" I have chosen \"No\", and simply initialize it manually:\n\n```bash\nsource /home/tokyo/miniconda3/etc/profile.d/conda.sh\n```\n\n## Dependencies\n\nEnvironment: \n\n```bash\nconda create -n banktree\nconda info --envs\n\n# conda environments:\n#\nbase                   /home/tokyo/miniconda3\nbanktree               /home/tokyo/miniconda3/envs/banktree\n\nconda activate banktree\n```\n\nDependencies:\n\n```bash\nconda install python=3.13\nconda install pandas scikit-learn\nconda install -c conda-forge ucimlrepo certifi \nconda install requests tabulate\n```\n\nExit and removal:\n\n```bash\nconda deactivate\nconda env remove --name banktree\nrm -rf /home/tokyo/miniconda3/envs/banktree\nconda clean --all\n```\n\n## Grok\n\n\"Give me the script which loads the UCI Bank Marketing Dataset, splits 20% into testing, builds CART, outputs training set sample number, accuracy, conversion rate, same for testing. 
Also, output top ten groups based on job and education with highest conversion rates and show sample numbers, nothing else.\"\n\n```bash\npython main_grok.py\nTraining set:\n  Sample number: 36168\n  Accuracy: 1.0000\n  Conversion rate: 0.1161\n\nTesting set:\n  Sample number: 9043\n  Accuracy: 0.8734\n  Conversion rate: 0.1206\n\nTop 10 groups based on job and education with highest conversion rates:\n          Job Education  Conversion Rate  Sample Number\n      student   primary         0.363636             44\n      student secondary         0.297244            508\n      retired  tertiary         0.275956            366\n      student  tertiary         0.264574            223\n      retired   primary         0.223899            795\n      retired secondary         0.210366            984\n   unemployed  tertiary         0.193772            289\n       admin.  tertiary         0.173077            572\n  blue-collar  tertiary         0.161074            149\nself-employed  tertiary         0.160864            833\n\n```\n\n## deepseek\n\n\"Give me the script which loads the UCI Bank Marketing Dataset, splits 20% into testing, builds CART, outputs training set sample number, accuracy, conversion rate, same for testing. Also, output top ten groups based on job and education with highest conversion rates and show sample numbers, nothing else.\"\n\n\"The link is https://archive.ics.uci.edu/static/public/222/bank+marketing.zip, and it's a zip file, not csv! Inside bank+marketing.zip there are bank.zip and bank-additional.zip. Inside bank.zip there is bank.csv (around 460 KB), bank-full.csv (around 4.6MB) and bank-names.txt 3.9 KB. 
Inside bank-additional.zip there is bank-additional folder inside it bank-additional.csv (around 584KB), bank-additional-full.csv (~5.8MB), and bank-additional-names.txt (~5.5KB).\"\n\n```bash\npython main_deepseek.py\nTraining Set\nSamples: 32,950\nAccuracy: 100.00%\nConversion Rate: 11.24%\n\nTesting Set\nSamples: 8,238\nAccuracy: 88.69%\nConversion Rate: 11.35%\n\nConversion Rate Ranges:\nMax: 35.35%\nMin: 18.64%\n\nTop 10 Job/Education Groups:\n|                                      |   Conversion_Rate |   Samples |\n|:-------------------------------------|------------------:|----------:|\n| ('student', 'basic.9y')              |          0.353535 |        99 |\n| ('student', 'unknown')               |          0.353293 |       167 |\n| ('retired', 'unknown')               |          0.336735 |        98 |\n| ('student', 'high.school')           |          0.319328 |       357 |\n| ('retired', 'basic.4y')              |          0.309883 |       597 |\n| ('retired', 'professional.course')   |          0.236515 |       241 |\n| ('retired', 'university.degree')     |          0.231579 |       285 |\n| ('retired', 'high.school')           |          0.224638 |       276 |\n| ('student', 'university.degree')     |          0.205882 |       170 |\n| ('housemaid', 'professional.course') |          0.186441 |        59 |\n\n```\n\n## Notes\n\n* As Leo Breiman himself noted in 2001, CART is not the most accurate method.\n\n* CART is great in that it handles any data (missing, mixing continuous with nominal), and is automatic. It is also fast: no inverses, no learning, no GPUs needed. Ideal for rough estimates.\n\n* I would not spend too much time on the generated trees, clusters/rules, variable importance.\n\n* pip is horrible, but conda solves the problem. 
Jupyter Notebook is not that useful.\n\n* ChatGPT, deepseek, and Grok are great for such scripts, but one needs to debug/iterate.\n\n* [Artiom Kovnatsky](https://www.artiomkovnatsky.com/) uses CART in real-world commercial projects.\n   \n## References \n\n[A Conversation with Leo Breiman (2001)](https://projecteuclid.org/journals/statistical-science/volume-16/issue-2/A-Conversaton-with-Leo-Breiman/10.1214/ss/1009213290.full)        \n   \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faabbtree77%2Fuci-marketing-analysis-cart","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faabbtree77%2Fuci-marketing-analysis-cart","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faabbtree77%2Fuci-marketing-analysis-cart/lists"}