{"id":20733448,"url":"https://github.com/moindalvs/assignment-basic-stats-level1","last_synced_at":"2026-01-24T18:31:33.168Z","repository":{"id":127427911,"uuid":"459589648","full_name":"MoinDalvs/Assignment-Basic-Stats-Level1","owner":"MoinDalvs","description":"Assignment Basic Stats","archived":false,"fork":false,"pushed_at":"2022-03-03T07:20:21.000Z","size":450,"stargazers_count":1,"open_issues_count":0,"forks_count":3,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-15T07:10:01.400Z","etag":null,"topics":["bia","binomial-distribution","boxplot","confidence-intervals","datatypes","descriptive-statistics","inferential-statistics","normal-distribution","numpy","pandas","probability","python3","scipy-stats","statistics","z-score"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/MoinDalvs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-02-15T13:26:54.000Z","updated_at":"2023-05-01T02:36:13.000Z","dependencies_parsed_at":"2023-08-16T10:03:40.441Z","dependency_job_id":null,"html_url":"https://github.com/MoinDalvs/Assignment-Basic-Stats-Level1","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/MoinDalvs/Assignment-Basic-Stats-Level1","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MoinDalvs%2FAssignment-Basic-Stats-Level1","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MoinDalvs%2FAssignment-Basic-Stats-Level1/tags","releases_ur
l":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MoinDalvs%2FAssignment-Basic-Stats-Level1/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MoinDalvs%2FAssignment-Basic-Stats-Level1/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/MoinDalvs","download_url":"https://codeload.github.com/MoinDalvs/Assignment-Basic-Stats-Level1/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MoinDalvs%2FAssignment-Basic-Stats-Level1/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28734307,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-24T17:51:25.893Z","status":"ssl_error","status_checked_at":"2026-01-24T17:50:48.377Z","response_time":89,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bia","binomial-distribution","boxplot","confidence-intervals","datatypes","descriptive-statistics","inferential-statistics","normal-distribution","numpy","pandas","probability","python3","scipy-stats","statistics","z-score"],"created_at":"2024-11-17T05:25:28.349Z","updated_at":"2026-01-24T18:31:33.154Z","avatar_url":"https://github.com/MoinDalvs.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Assignment Basic Stats\n\n## Q1) Identify the Data type for the Following:\n\n## Q2) Identify the Data 
types, which were among the following\n### Nominal, Ordinal, Interval, Ratio.\n\n| Data | Data Type |\n| --- | --- |\n| Gender | Nominal |\n| High School Class Ranking | Ordinal |\n| Celsius Temperature | Interval |\n| Weight | Ratio |\n| Hair Color | Nominal |\n| Socioeconomic Status | Ordinal |\n| Fahrenheit Temperature | Interval |\n| Height | Ratio |\n| Type of living accommodation | Ordinal |\n| Level of Agreement | Ordinal |\n| IQ (Intelligence Scale) | Interval |\n| Sales Figures | Ratio |\n| Blood Group | Nominal |\n| Time Of Day | Ordinal |\n| Time on a Clock with Hands | Interval |\n| Number of Children | Ratio |\n| Religious Preference | Nominal |\n| Barometer Pressure | Ratio |\n| SAT Scores | Interval |\n| Years of Education | Ratio |\n\n\n## Q3) Three Coins are tossed, find the probability that two heads and one tail are obtained?\nAns:\nThree coins give 2^3 = 8 equally likely outcomes, of which HHT, HTH and THH have two heads and one tail.\n\nP (Two heads and one tail) = N (Event (Two heads and one tail)) / N (Event (Three coins tossed))\n= 3/8 = 0.375 = 37.5%\n\n## Q4) Two Dice are rolled, find the probability where sum is\na)\tEqual to 1\nb)\tLess than or equal to 4\nc)\tDivisible by 2 and 3\nAns:\nNumber of possible outcomes for the above event is\nN (Event (Two dice rolled)) = 6^2 = 36\na.)\tP (Sum is equal to 1) = 0, because the smallest possible sum of two dice is 2.\nb.)\tP (Sum is less than or equal to 4) = N (Event (Sum is less than or equal to 4)) / N (Event (Two dice rolled))\n= 6 / 36 = 1/6 = 0.1667 = 16.67% (sums 2, 3 and 4 occur in 1 + 2 + 3 = 6 ways)\nc.)\tP (Sum is divisible by 2 and 3) = N (Event (Sum is divisible by 2 and 3)) / N (Event (Two dice rolled))\n= 6 / 36 = 1/6 = 0.1667 = 16.67% (a sum divisible by both 2 and 3 is divisible by 6; sums 6 and 12 occur in 5 + 1 = 6 ways)\n\n## Q5) A bag contains 2 red, 3 green and 2 blue balls. Two balls are drawn at random. 
What is the probability that none of the balls drawn is blue?\nAns: Total number of balls = 7\n\nN (Event (2 balls are drawn randomly from bag)) = 7! / (2! × 5!)\n= (7×6×5×4×3×2×1) / ((2×1) × (5×4×3×2×1))\n= (7×6) / (2×1) = 21\n\nNumber of balls that are not blue = 7 – 2 = 5\n\nN (Event (None of the balls drawn is blue)) = 5! / (2! × 3!) = (5×4) / (2×1) = 10\n\nP (None of the balls drawn is blue) = N (Event (None of the balls drawn is blue)) / N (Event (2 balls are drawn randomly from bag))\n= 10 / 21 ≈ 0.476\n\n## Q6) Calculate the Expected number of candies for a randomly selected child \nBelow are the probabilities of count of candies for children (ignoring the nature of the child-Generalized view)\n\n| CHILD | Candies count | Probability |\n| --- | --- | --- |\n| A | 1 | 0.015 |\n| B | 4 | 0.20 |\n| C | 3 | 0.65 |\n| D | 5 | 0.005 |\n| E | 6 | 0.01 |\n| F | 2 | 0.120 |\n\nChild A – probability of having 1 candy = 0.015.\nChild B – probability of having 4 candies = 0.20\nAns: \nExpected number of candies = Sum (Candies count × Probability)\n= (1)(0.015) + (4)(0.20) + (3)(0.65) + (5)(0.005) + (6)(0.01) + (2)(0.120)\n= 0.015 + 0.8 + 1.95 + 0.025 + 0.06 + 0.24 = 3.09\n\n## Q7) Calculate Mean, Median, Mode, Variance, Standard Deviation, Range \u0026 comment about the values / draw inferences, for the given dataset\n-\tFor Points, Score, Weigh\nFind Mean, Median, Mode, Variance, Standard Deviation, and Range and also Comment about the values/ Draw some inferences. 
\nUse Q7.csv file \nAns:\nMean for Points = 3.59, Score = 3.21 and Weigh = 17.84\nMedian for Points = 3.69, Score = 3.32 and Weigh = 17.71\nMode for Points = 3.07, Score = 3.44 and Weigh = 17.02\nVariance for Points = 0.28, Score = 0.95, Weigh = 3.19\nStandard Deviation for Points = 0.53, Score = 0.97, Weigh = 1.78\nRange [Min-Max] for Points [3.59 – 4.93], Score [3.21 – 5.42] and Weigh [17.84 – 22.9] \nDraw Inferences \n \n## Q8) Calculate Expected Value for the problem below\na)\tThe weights (X) of patients at a clinic (in pounds), are\n108, 110, 123, 134, 135, 145, 167, 187, 199\nAssume one of the patients is chosen at random. What is the Expected Value of the Weight of that patient?\nAns: Each of the 9 patients is equally likely, so Probability of X = 1/9.\nExpected value = Sum (X × Probability of X)\n= (1/9)(108)+ (1/9)(110)+ (1/9)(123)+ (1/9)(134)+ (1/9)(135)+ (1/9)(145)+ (1/9)(167)+ (1/9)(187)+ (1/9)(199)\n= 1308 / 9 = 145.33\n\n## Q9) Calculate Skewness, Kurtosis \u0026 draw inferences on the following data\nCars speed and distance \nUse Q9_a.csv\nAns: \nq9a = pd.read_csv(\"C:/Users/Moin Dalvi/Documents/EXcelR Study and Assignment Material/Data Science Assignments/Basic Statistics 1/Q9_a.csv\", index_col = 'Index')\n\nprint('For Cars Speed', \"Skewness value=\", np.round(q9a.speed.skew(),2), 'and' , 'Kurtosis value=', np.round(q9a.speed.kurt(),2))\nFor Cars Speed Skewness value= -0.12 and Kurtosis value= -0.51\n\nprint('Skewness value =', np.round(q9a.dist.skew(),2),'and', 'Kurtosis value =', np.round(q9a.dist.kurt(),2), 'for Cars Distance')\nSkewness value = 0.81 and Kurtosis value = 0.41 for Cars Distance\n\nSP and Weight (WT)\nUse Q9_b.csv\nAns:\nq9b =pd.read_csv(\"C:/Users/Moin Dalvi/Documents/EXcelR Study and Assignment Material/Data Science Assignments/Basic Statistics 1/Q9_b.csv\")\n\nprint('For SP Skewness =', np.round(q9b.SP.skew(),2), 'kurtosis =', np.round(q9b.SP.kurt(),2))\nFor SP Skewness = 1.61 kurtosis = 2.98\n\nprint('For WT Skewness =', np.round(q9b.WT.skew(),2), 'Kurtosis =', np.round(q9b.WT.kurt(),2))\nFor WT Skewness = -0.61 
Kurtosis = 0.95\n\n## Q10) Draw inferences about the following boxplot \u0026 histogram\n\n \nAns: The histogram is right-skewed: it peaks on the left and its tail extends to the right, so Mean \u003e Median. There are outliers on the higher side. \n \nAns: The boxplot has outliers on the maximum side.\n\n## Q11) Suppose we want to estimate the average weight of an adult male in Mexico. We draw a random sample of 2,000 men from a population of 3,000,000 men and weigh them. We find that the average person in our sample weighs 200 pounds, and the standard deviation of the sample is 30 pounds. Calculate 94%,98%,96% confidence interval?\nAns:\nconf_94 = stats.t.interval(0.94, df=1999, loc=200, scale=30/np.sqrt(2000))\nprint(np.round(conf_94,0))\nprint(conf_94)\nFor 94% confidence interval range is [198.73 – 201.26] \nFor 98% confidence interval range is [198.43 – 201.56] \nFor 96% confidence interval range is [198.62 – 201.37]\n\n## Q12) Below are the scores obtained by a student in tests \n34,36,36,38,38,39,39,40,40,41,41,41,41,42,42,45,49,56\n\n1)\tFind mean, median, variance, standard deviation.\nAns: Mean = 41, Median = 40.5, Variance = 25.52 and Standard Deviation = 5.05\n\n2)\tWhat can we say about the student marks? \nAns: Most marks lie between 34 and 45, while the two highest scores, 49 and 56, fall above the 1.5 × IQR upper fence and are outliers. The data is slightly skewed towards the right because the mean is greater than the median. 
\n\n## Q13) What is the nature of skewness when mean and median of data are equal?\nAns: No skewness is present; the distribution is symmetric.\n\n## Q14) What is the nature of skewness when mean \u003e median?\nAns: The distribution is right (positively) skewed, with the tail towards the right.\n\n## Q15) What is the nature of skewness when median \u003e mean?\nAns: The distribution is left (negatively) skewed, with the tail towards the left.\n\n## Q16) What does a positive kurtosis value indicate for the data?\nAns: Positive kurtosis means the curve is more peaked, with heavier tails, than a normal distribution; it is leptokurtic.\n\n## Q17) What does a negative kurtosis value indicate for the data?\nAns: Negative kurtosis means the curve is flatter and broader, with lighter tails, than a normal distribution; it is platykurtic.\n\n## Q18) Answer the below questions using the below boxplot visualization.\n \n+ What can we say about the distribution of the data?\nAns: The data in the boxplot is not normally distributed; the median lies towards the higher values.\n\n+ What is the nature of skewness of the data?\nAns: The data is skewed towards the left: the whisker on the minimum side is longer than the whisker on the maximum side.\n\n+ What will be the IQR of the data (approximately)? \nAns: Interquartile Range = Q3 (upper quartile) – Q1 (lower quartile) = 18 – 10 = 8\n\n\n## Q19) Comment on the below Boxplot visualizations? \n \nDraw an Inference from the distribution of data for Boxplot 1 with respect to Boxplot 2.\nAns: First, there are no outliers. 
Second, both box plots share the same median, approximately between 250 and 275, and both appear approximately normally distributed, with little to no skewness towards either the minimum or the maximum whisker.\n\n\n\n## Q20) Calculate probability from the given dataset for the below cases\n\nDataset: Cars.csv\nCalculate the probability of MPG of Cars for the below cases.\n       MPG \u003c- Cars $ MPG\na.\tP(MPG\u003e38)\nAns: Prob_MPG_greater_than_38 = np.round(1 - stats.norm.cdf(38, loc= q20.MPG.mean(), scale= q20.MPG.std()),3)\nprint('P(MPG\u003e38)=',Prob_MPG_greater_than_38)\n\nP(MPG\u003e38)= 0.348\n\nb.\tP(MPG\u003c40)\nAns: prob_MPG_less_than_40 = np.round(stats.norm.cdf(40, loc = q20.MPG.mean(), scale = q20.MPG.std()),3)\nprint('P(MPG\u003c40)=',prob_MPG_less_than_40)\n\nP(MPG\u003c40)= 0.729\n\nc.\tP (20\u003cMPG\u003c50)\nAns: prob_MPG_less_than_20 = np.round(stats.norm.cdf(20, loc = q20.MPG.mean(), scale = q20.MPG.std()),3)\nprint('P(MPG\u003c20)=',(prob_MPG_less_than_20))\nP(MPG\u003c20)= 0.057\n\nprob_MPG_less_than_50 = np.round(stats.norm.cdf(50, loc = q20.MPG.mean(), scale = q20.MPG.std()),3)\nprint('P(MPG\u003c50)=',(prob_MPG_less_than_50))\nP(MPG\u003c50)= 0.956\n\nprob_MPG_greaterthan20_and_lessthan50 = np.round(prob_MPG_less_than_50 - prob_MPG_less_than_20, 3)\nprint('P(20\u003cMPG\u003c50)=',(prob_MPG_greaterthan20_and_lessthan50))\nP(20\u003cMPG\u003c50)= 0.899\n\n\n\n## Q21) Check whether the data follows normal distribution\na)\tCheck whether the MPG of Cars follows Normal Distribution \n        Dataset: Cars.csv\nAns: \na.)\tMPG of cars follows a normal distribution \n \n\n\nb)\tCheck whether the Adipose Tissue (AT) and Waist Circumference (Waist) from wc-at data set follow Normal Distribution \n       Dataset: wc-at.csv\nAns: Adipose Tissue (AT) and Waist do not follow a normal distribution\n \n \n\n## Q22) Calculate the Z scores of 90% confidence interval,94% confidence interval, 60% confidence interval \nAns: \n# z value 
for 90% confidence interval\nprint('Z score for 90% Confidence Interval =',np.round(stats.norm.ppf(.05),4))\nZ score for 90% Confidence Interval = -1.6449\n\n# z value for 94% confidence interval\nprint('Z score for 94% Confidence Interval =',np.round(stats.norm.ppf(.03),4))\nZ score for 94% Confidence Interval = -1.8808\n\n# z value for 60% confidence interval\nprint('Z score for 60% Confidence Interval =',np.round(stats.norm.ppf(.2),4))\nZ score for 60% Confidence Interval = -0.8416\n\n## Q23) Calculate the t scores of 95% confidence interval, 96% confidence interval, 99% confidence interval for sample size of 25\nAns: \n# t score for 95% confidence interval\nprint('T score for 95% Confidence Interval =',np.round(stats.t.ppf(0.025,df=24),4))\nT score for 95% Confidence Interval = -2.0639\n\n\n# t score for 96% confidence interval\nprint('T score for 96% Confidence Interval =',np.round(stats.t.ppf(0.02,df=24),4))\nT score for 96% Confidence Interval = -2.1715\n\n# t score for 99% confidence interval\nprint('T score for 99% Confidence Interval =',np.round(stats.t.ppf(0.005,df=24),4))\nT score for 99% Confidence Interval = -2.7969\n\n## Q24) A Government company claims that an average light bulb lasts 270 days. A researcher randomly selects 18 bulbs for testing. The sampled bulbs last an average of 260 days, with a standard deviation of 90 days. 
If the CEO's claim were true, what is the probability that 18 randomly selected bulbs would have an average life of no more than 260 days?\nHint:  \n   rcode   pt(tscore,df)  \n df  degrees of freedom\nAns:\nimport numpy as np\nfrom scipy import stats\n\nFormula: t_score = (sample mean - population mean) / (sample standard deviation / square root of sample size)\n\nt_score = (260 - 270) / (90 / np.sqrt(18))\nt_score = -0.471\nstats.t.cdf(t_score, df = 17)\n0.32 = 32%\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmoindalvs%2Fassignment-basic-stats-level1","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmoindalvs%2Fassignment-basic-stats-level1","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmoindalvs%2Fassignment-basic-stats-level1/lists"}