https://github.com/paweljakubas/j-data-analysis


# Data analysis using J

by Pawel Jakubas, PhD

These notes cover a number of topics that I have found fundamental to mastering
data analysis in the J language. The prerequisite for fully understanding the
examples below is *Learning J*, the recommended first introduction to the language.
A great example of how beautifully and efficiently J can be used in a specific domain is
the wonderful *Fractals, Visualization and J*. In addition, a list of high-quality book references is given below.
The notes are meant to be hands-on and to illustrate how efficiently data analysis can be performed in J.
The topics and techniques presented reflect the author's subjective take on what is crucial for mastering the many tasks
that powerful data analysis requires.

### SQL, data analysis, probability, statistics and machine learning
1. SQL Cookbook, Anthony Molinaro, Robert de Graaf, 2nd ed., O'Reilly 2021
2. SQL for Data Analysis: Advanced Techniques for Transforming Data into Insights, Cathy Tanimura, O'Reilly 2021
3. Data Analysis Techniques for Physical Scientists, Claude A. Pruneau, Cambridge University Press 2017
4. Statistics, Data Mining, and Machine Learning in Astronomy: A Practical Python Guide for the Analysis of Survey Data, Željko Ivezić, Andrew J. Connolly, Jacob T. VanderPlas, and Alexander Gray, Updated Edition, Princeton University Press 2020
5. Information Theory, Inference, and Learning Algorithms, David J.C. MacKay, Cambridge University Press 2003
6. Introduction to Probability Models, Sheldon M. Ross, 12th ed., Academic Press 2019
7. Statistical Inference, George Casella, Roger L. Berger, 2nd ed., Cengage Learning 2001
8. Computational Statistics, Geof H. Givens and Jennifer A. Hoeting, 2nd ed., Wiley 2013
9. Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, Worked Examples, and Case Studies, John D. Kelleher, Brian Mac Namee, Aoife D'Arcy, 2nd ed., MIT Press 2020
10. Statistical Rethinking: A Bayesian Course with Examples in R and STAN, Richard McElreath, 2nd ed., CRC 2020

### J language
11. Learning J. An Introduction to the J Programming Language, Roger Stokes, [https://www.jsoftware.com/help/learning/contents.htm#toc]
12. Fractals, Visualization & J, 4th ed. (2 Parts), Clifford A. Reiter 2016
13. Fifty Shades of J, Norman Thomson, [https://code.jsoftware.com/wiki/Fifty_Shades_of_J]
14. Linear algebra and random matrices using J, Pawel Jakubas, [https://github.com/paweljakubas/j-random-matrices]
15. Numerical methods using J, Pawel Jakubas, [https://github.com/paweljakubas/j-numerical-methods] (coming soon)

*All code in these notes was written for and run with the J903 release of J.*

## Contents
### [SQL](chapters/sql.md)
### [CSV](chapters/csv.md)
### [Inverted table](chapters/inverted-table.md)
### [Gnuplot](chapters/gnuplot.md)
### [Inverted table workout and comparison with Polars and pandas](chapters/workout.md)
### [Data analysis cases using SQL and J's approach]
### [Information-based learning]
### [Similarity-based learning]
### [Deep look at statistical distributions]
### [Classical inference]
### [Bayesian inference]
### [Error-based learning]
### [Deep learning]
### [Case study - Titanic]
### [Case study - Multimodal Single-Cell Integration]
### [Case study - Galaxy classification]
### [Case study - Treasury yield curves and surfaces]