{"id":19541444,"url":"https://github.com/genesisblock3301/probability_statistics_and_machine_learning","last_synced_at":"2025-02-26T05:18:33.547Z","repository":{"id":117979649,"uuid":"595059231","full_name":"GenesisBlock3301/probability_statistics_and_machine_learning","owner":"GenesisBlock3301","description":"This repo for learning ML related concept and tools","archived":false,"fork":false,"pushed_at":"2024-05-07T09:38:46.000Z","size":4639,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-01-08T18:45:33.247Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/GenesisBlock3301.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-01-30T10:03:14.000Z","updated_at":"2024-05-07T09:38:49.000Z","dependencies_parsed_at":"2024-05-07T10:54:00.942Z","dependency_job_id":null,"html_url":"https://github.com/GenesisBlock3301/probability_statistics_and_machine_learning","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GenesisBlock3301%2Fprobability_statistics_and_machine_learning","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GenesisBlock3301%2Fprobability_statistics_and_machine_learning/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GenesisBlock3301%2Fprobability_statistics_and_machine_learning/rel
eases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GenesisBlock3301%2Fprobability_statistics_and_machine_learning/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/GenesisBlock3301","download_url":"https://codeload.github.com/GenesisBlock3301/probability_statistics_and_machine_learning/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240795196,"owners_count":19858765,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-11T03:10:31.810Z","updated_at":"2025-02-26T05:18:33.493Z","avatar_url":"https://github.com/GenesisBlock3301.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Learning Roadmap of Probability and Statistics\n\n### Statistics roadmap for ML:\nProbability theory:\n- Probability, random variables, probability distributions, and conditional probability are crucial for modeling uncertainty in ML.\n\nDescriptive statistics:\n- Measures of central tendency (mean, median, mode)\n- Measures of dispersion (variance, standard deviation)\n\nInferential statistics:\n- Hypothesis testing, confidence intervals, and p-values are essential for making inferences and drawing conclusions from data samples.\n\nRegression analysis:\n- Linear regression and its variants are widely used in ML for modeling relationships between variables and making predictions.\n\nProbability distributions:\n- Gaussian (normal) distribution\n- Binomial distribution\n- Poisson's distribution is beneficial for 
understanding the behavior of data and modeling assumptions.\n\nSampling techniques:\n- Understanding different sampling techniques, such as random sampling and stratified sampling, is important for collecting representative training and test datasets.\n\nStatistical hypothesis testing:\n- Knowing how to perform hypothesis tests, interpret the results, and make decisions based on statistical significance is crucial for evaluating ML models.\n\nStatistical modeling:\n- Knowledge of techniques like maximum likelihood estimation (MLE) and Bayesian inference can be helpful for parameter estimation and building probabilistic models.\n\nExperimental design:\n- Understanding principles of experimental design, such as randomization,\n  control groups, and factorial designs, helps in conducting rigorous experiments and A/B testing in ML.\n\nMultivariate statistics:\n- Techniques like principal component analysis (PCA), factor analysis, and cluster analysis provide tools for dimensionality reduction, feature selection, and pattern recognition in high-dimensional datasets.\n\n**Exploratory data analysis**\n1. **Scatter plot.**\n2. **Pair Plot.**\n3. **Histogram**\n4. **Cumulative Distribution**\n5. **Mean and Standard Deviation**\n6. **Median, Percentile, Quantile**\n7. MAD, Box plot and Violin Plot\n-------\n8. EDA on Cancer Dataset\n9. Gaussian or Normal distribution\n10. Skewness and Kurtosis\n11. Sampling Distribution \u0026 Standard Normal Variate (z) and Standardization\n12. Quantile-quantile plot\n13. Chebyshev's inequality\n14. Uniform Distribution\n15. Bernoulli vs Binomial vs Normal vs Pareto Distribution\n16. Box-Cox Transformation\n17. Covariance Statistics\n18. Pearson Correlation\n19. Spearman rank Correlation Coefficient\n20. Correlation vs Causation and confidence interval.\n21. Confidence Interval with underlying or Gaussian Distribution.\n22. Hypothesis testing and p-value statistics.\n23. 
T-test vs Chi-square test vs ANOVA test\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgenesisblock3301%2Fprobability_statistics_and_machine_learning","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgenesisblock3301%2Fprobability_statistics_and_machine_learning","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgenesisblock3301%2Fprobability_statistics_and_machine_learning/lists"}