https://github.com/danmadeira/algoritmos-estatistica-pl-sql
Demonstration of Statistics Algorithms in PL/SQL
algorithms data-analysis data-science database oracle oracle-database pl-sql statistics
- Host: GitHub
- URL: https://github.com/danmadeira/algoritmos-estatistica-pl-sql
- Owner: danmadeira
- License: gpl-3.0
- Created: 2022-11-13T14:06:38.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2022-11-15T13:29:27.000Z (almost 2 years ago)
- Last Synced: 2023-10-20T07:51:35.277Z (about 1 year ago)
- Topics: algorithms, data-analysis, data-science, database, oracle, oracle-database, pl-sql, statistics
- Homepage:
- Size: 32.2 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
## Demonstration of Statistics Algorithms in PL/SQL
This is a demonstration of functions that implement some statistical calculations. The algorithms that represent the statistical formulas are written in a simple structure that closely follows the direct application of each formula, to keep the code as clean as possible and therefore didactic.
So far, these are some of the statistical formulas included:
* Arithmetic Mean

  $$\mu = \frac{1}{n} \sum_{i=1}^{n} x_i$$

* Weighted Arithmetic Mean

  $$\mu_p = \frac{1}{\sum_{i=1}^{n} w_i} \sum_{i=1}^{n} w_i x_i$$

* Geometric Mean

  $$\mu_g = \sqrt[n]{\prod_{i=1}^{n} x_i}$$

* Weighted Geometric Mean

  $$\mu_{gp} = \sqrt[\sum_{i=1}^{n} w_i]{\prod_{i=1}^{n} {x_i}^{w_i}}$$

* Harmonic Mean

  $$\mu_h = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$

* Weighted Harmonic Mean

  $$\mu_{hp} = \frac{\sum_{i=1}^{n} w_i}{\sum_{i=1}^{n} \frac{w_i}{x_i}}$$

* Quadratic Mean

  $$\mu_q = \sqrt{\frac{1}{n} \sum_{i=1}^{n} {x_i}^2}$$

* Weighted Quadratic Mean

  $$\mu_{qp} = \sqrt{\frac{1}{\sum_{i=1}^{n} w_i} \sum_{i=1}^{n} w_i {x_i}^2}$$

* Cubic Mean

  $$\mu_c = \sqrt[3]{\frac{1}{n} \sum_{i=1}^{n} {x_i}^3}$$

* Weighted Cubic Mean

  $$\mu_{cp} = \sqrt[3]{\frac{1}{\sum_{i=1}^{n} w_i} \sum_{i=1}^{n} w_i {x_i}^3}$$

* Disharmonic Mean (Média Desarmônica)

  $$\mu_d = \frac{2}{\frac{1}{\frac{\sum_{i=1}^{n} x_i}{n}} + \frac{1}{\frac{{\bigl(\frac{\sum_{i=1}^{n} x_i}{n}\bigr)}^2}{\frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}}}}$$

* Weighted Disharmonic Mean (Média Desarmônica Ponderada)

  $$\mu_{dp} = \frac{2}{\frac{1}{\frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}} + \frac{1}{\frac{{\Bigl(\frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}\Bigr)}^2}{\frac{\sum_{i=1}^{n} w_i}{\sum_{i=1}^{n} \frac{w_i}{x_i}}}}}$$

* Median
* Mode
* Mean Absolute Deviation

  $$D_{am} = \frac{1}{n} \sum_{i=1}^{n} |x_i - \mu|$$

* Median Absolute Deviation

  $$D_{am} = Md(|x_i - \tilde{x}|)$$

* Population Variance

  $$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2$$

* Population Standard Deviation

  $$\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2}$$

* Sample Variance

  $$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

* Sample Standard Deviation

  $$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

* Population Variance (for grouped data)

  $$\sigma^2 = \frac{1}{\sum_{i=1}^{n} w_i} \sum_{i=1}^{n} \bigl((x_i - \mu)^2 w_i\bigr)$$

* Population Standard Deviation (for grouped data)

  $$\sigma = \sqrt{\frac{1}{\sum_{i=1}^{n} w_i} \sum_{i=1}^{n} \bigl((x_i - \mu)^2 w_i\bigr)}$$

* Sample Variance (for grouped data)

  $$s^2 = \frac{1}{\sum_{i=1}^{n} w_i - 1} \sum_{i=1}^{n} \bigl((x_i - \bar{x})^2 w_i\bigr)$$

* Sample Standard Deviation (for grouped data)

  $$s = \sqrt{\frac{1}{\sum_{i=1}^{n} w_i - 1} \sum_{i=1}^{n} \bigl((x_i - \bar{x})^2 w_i\bigr)}$$

* Coefficient of Variation

  $$CV = \frac{\sigma}{\mu} \times 100$$

* Coefficient of Variation (for grouped data)

  $$CV = \frac{\sigma}{\mu} \times 100$$

* Population Covariance

  $$\sigma_{xy} = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu_x)(y_i - \mu_y)$$

* Sample Covariance

  $$s_{xy} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$

* Pearson Population Correlation Coefficient

  $$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$$

* Pearson Sample Correlation Coefficient

  $$r_{xy} = \frac{s_{xy}}{s_x s_y}$$

* Sum of Squares

  $$SS_x = \sum_{i=1}^{n} {x_i}^2 - \frac{(\sum_{i=1}^{n} x_i)^2}{n}$$

* Sum of Products XY

  $$SS_{xy} = \sum_{i=1}^{n} x_i y_i - \frac{(\sum_{i=1}^{n} x_i)(\sum_{i=1}^{n} y_i)}{n}$$

* Pearson Correlation Coefficient

  $$r = \frac{SS_{xy}}{\sqrt{SS_x \times SS_y}}$$

* Population Z-score

  $$z = \frac{x - \mu}{\sigma}$$

* Sample Z-score

  $$z = \frac{x - \bar{x}}{s}$$

* Three Deviations
* Range
* Skewness

  $$A = \frac{1}{n} \sum_{i=1}^{n} \Bigl(\frac{x_i - \bar{x}}{s}\Bigr)^3$$

* Kurtosis

  $$K = \frac{1}{n} \sum_{i=1}^{n} \Bigl(\frac{x_i - \bar{x}}{s}\Bigr)^4 - 3$$

* Quartiles

  $$i = \frac{j(n+1)}{4}$$

  $$Q_j = x_i + \biggl(\frac{j(n+1)}{4} - i\biggr) (x_{i+1} - x_i)$$

  *for j = 1, 2 and 3*

In addition, there are functions for:

* Ungrouping data
* Grouping data
* Sorting data
* Sorting frequencies

Note: The functions have versions for grouped or ungrouped data and for sample or population data. There is also an example script demonstrating how the functions are called.
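
To illustrate the formula-mirroring style described above, here is a minimal PL/SQL sketch. It is not the repository's actual code: the type name `t_numbers` and the function names `arithmetic_mean`, `sample_variance`, and `quartile` are hypothetical, chosen only for this example. The quartile function assumes the input is already sorted in ascending order and reads $i$ as the integer part of $j(n+1)/4$, which is an assumption about how the formula is meant to be applied. Empty collections are not handled.

```plsql
DECLARE
  -- Hypothetical collection type; the repository may use different names.
  TYPE t_numbers IS TABLE OF NUMBER;

  -- Sample data, already sorted ascending (required by quartile below).
  v_data t_numbers := t_numbers(2, 4, 4, 5, 7, 9, 10, 12);

  -- Arithmetic mean: (1/n) * sum of x_i
  FUNCTION arithmetic_mean(p_x IN t_numbers) RETURN NUMBER IS
    v_sum NUMBER := 0;
  BEGIN
    FOR i IN 1 .. p_x.COUNT LOOP
      v_sum := v_sum + p_x(i);
    END LOOP;
    RETURN v_sum / p_x.COUNT;
  END arithmetic_mean;

  -- Sample variance: (1/(n-1)) * sum of (x_i - mean)^2
  FUNCTION sample_variance(p_x IN t_numbers) RETURN NUMBER IS
    v_mean NUMBER := arithmetic_mean(p_x);
    v_sum  NUMBER := 0;
  BEGIN
    FOR i IN 1 .. p_x.COUNT LOOP
      v_sum := v_sum + POWER(p_x(i) - v_mean, 2);
    END LOOP;
    RETURN v_sum / (p_x.COUNT - 1);
  END sample_variance;

  -- Quartile Q_j (j = 1, 2, 3) by linear interpolation at position j(n+1)/4.
  -- Assumption: i is the integer part of that position.
  FUNCTION quartile(p_x IN t_numbers, p_j IN PLS_INTEGER) RETURN NUMBER IS
    v_pos NUMBER      := p_j * (p_x.COUNT + 1) / 4;
    v_i   PLS_INTEGER := FLOOR(v_pos);
  BEGIN
    -- Clamp positions that fall outside the data to the first or last value.
    IF v_i < 1 THEN RETURN p_x(1); END IF;
    IF v_i >= p_x.COUNT THEN RETURN p_x(p_x.COUNT); END IF;
    RETURN p_x(v_i) + (v_pos - v_i) * (p_x(v_i + 1) - p_x(v_i));
  END quartile;
BEGIN
  DBMS_OUTPUT.PUT_LINE('mean            = ' || arithmetic_mean(v_data));
  DBMS_OUTPUT.PUT_LINE('sample variance = ' || sample_variance(v_data));
  FOR j IN 1 .. 3 LOOP
    DBMS_OUTPUT.PUT_LINE('Q' || j || ' = ' || quartile(v_data, j));
  END LOOP;
END;
/
```

In SQL*Plus or a similar client, enable `SET SERVEROUTPUT ON` before running the block so that the `DBMS_OUTPUT` lines are displayed.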