{"id":21702883,"url":"https://github.com/comsavvy/a-b-hypothesis-testing","last_synced_at":"2025-03-20T16:30:12.335Z","repository":{"id":180208084,"uuid":"287346894","full_name":"comsavvy/A-B-Hypothesis-testing","owner":"comsavvy","description":"Week 4 challenge @10Academy","archived":false,"fork":false,"pushed_at":"2020-08-15T19:58:10.000Z","size":259,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-01-25T15:29:21.660Z","etag":null,"topics":["algorithm","hypothesis-testing","machine-learning","plots"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/comsavvy.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-08-13T17:57:04.000Z","updated_at":"2020-10-31T10:38:07.000Z","dependencies_parsed_at":null,"dependency_job_id":"64dc4e0c-72c3-4987-ab47-b6cd1604da68","html_url":"https://github.com/comsavvy/A-B-Hypothesis-testing","commit_stats":null,"previous_names":["comsavvy/a-b-hypothesis-testing"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/comsavvy%2FA-B-Hypothesis-testing","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/comsavvy%2FA-B-Hypothesis-testing/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/comsavvy%2FA-B-Hypothesis-testing/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/comsavvy%2FA-B-Hypothesis-testin
g/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/comsavvy","download_url":"https://codeload.github.com/comsavvy/A-B-Hypothesis-testing/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244649629,"owners_count":20487460,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["algorithm","hypothesis-testing","machine-learning","plots"],"created_at":"2024-11-25T21:21:10.456Z","updated_at":"2025-03-20T16:30:12.328Z","avatar_url":"https://github.com/comsavvy.png","language":"Jupyter Notebook","readme":"# A-B-Hypothesis-testing\n## Task 2.1: Classic and sequential A/B testing analysis\nPerform data exploration: count the unique values of categorical variables, and make histograms, relational plots, and any other plots needed to understand the data. For each plot you produce, describe what it shows in a markdown cell.\nPerform hypothesis testing: apply the classical p-value-based algorithm and the sequential A/B testing algorithm, for which starter code is provided.\nIs the number of data points in the experiment enough to make a reasonable judgement, or should the company run a longer experiment? Remember that running the experiment longer may be costly for many reasons, so you should always optimize the number of samples to make a statistically sound decision.\nWhat does your A/B testing analysis tell you? 
Is brand awareness increased for the exposed group?\n## Task 2.2: Machine Learning\nIn at most three statements, formulate the machine learning problem and specify the target variable.\nSplit the data into 70% training, 20% validation, and 10% test sets.\nBased on the reading material provided, apply machine learning to the training data. Using 5-fold cross-validation, train a model with each of the following three algorithms:\nLogistic Regression\nDecision Trees\nXGBoost\nDefine the appropriate loss function for the model using the validation data.\nCompute feature importance - what’s driving the model? Which parameters are important predictors for the different ML models? What contributes to the goal of gaining more “Yes” results?\nWhich data features are relevant to predicting the target variable?\nExplain the difference between using A/B testing to test a hypothesis and using machine learning to learn the viability of the same effect.\nExplain the purpose of training with k-fold cross-validation instead of using the whole dataset to train the ML models.\nWhat information do you gain using the machine learning approach that you couldn’t obtain using A/B testing?\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcomsavvy%2Fa-b-hypothesis-testing","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcomsavvy%2Fa-b-hypothesis-testing","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcomsavvy%2Fa-b-hypothesis-testing/lists"}