https://github.com/noklam/blog_archive_fastpage
Nok's data science blog
blog data data-science machine-learning python sceince
- Host: GitHub
- URL: https://github.com/noklam/blog_archive_fastpage
- Owner: noklam
- License: apache-2.0
- Created: 2020-03-22T07:42:31.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2023-04-12T05:33:39.000Z (about 3 years ago)
- Last Synced: 2025-05-14T10:32:34.839Z (about 1 year ago)
- Topics: blog, data, data-science, machine-learning, python, sceince
- Language: Jupyter Notebook
- Homepage: https://noklam.github.io/blog
- Size: 70.9 MB
- Stars: 2
- Watchers: 2
- Forks: 1
- Open Issues: 12
Metadata Files:
- Readme: README.md
- License: LICENSE
---
description: My learning notes. Just in time (JIT) is better than Just in Case
---
# README
## Introduction
https://noklam.ml
## All things data
I am a data scientist. Recently, I find myself studying databases, data structures, and data pipelines far more than machine learning. When building a good model, I have found that writing good code to produce quality data often matters more than a SOTA model.
Delivering the model is the job of a data scientist, so inevitably every data scientist should be something of a "full-stack" data scientist.
This is a central repository for my blog posts and notes:
* Blog: https://noklam.ml \(GitHub Pages\) - Usually shorter articles, blog posts, or notes with code
* Blog: Medium \([https://medium.com/@nokknocknok](https://medium.com/@nokknocknok)\)
* GitBook \(Mainly study notes. I use Joplin to keep notes in Markdown and am considering syncing them to GitBook from time to time, but I haven't figured out the best way to do so.\)
## Resource
I am generally interested in tools that increase productivity, so please let me know if you have any recommendations. Here is a list of software and topics that I have found useful.
### Uncertainty Estimation
[Uncertainty Quantification in Deep Learning](https://www.inovex.de/blog/uncertainty-quantification-deep-learning/)
### Visualization
[Visualization \(University of Washington\)](https://observablehq.com/collection/@uwdata/visualization-curriculum)
#### Custom Matplotlib style for Presentation \(Larger font size\)
[https://raw.githubusercontent.com/noklam/mediumnok/master/\_demo/python-viz/presentation.mplstyle](https://raw.githubusercontent.com/noklam/mediumnok/master/_demo/python-viz/presentation.mplstyle)
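```python
import matplotlib.pyplot as plt

my_style = 'https://raw.githubusercontent.com/noklam/mediumnok/master/_demo/python-viz/presentation.mplstyle'
# Apply ggplot first, then layer the presentation style on top, only inside this block.
with plt.style.context(['ggplot', my_style]):
    make_scatter_plot()  # placeholder for your own plotting function
    make_line_plot()  # placeholder for your own plotting function
```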
### Useful Python Tools
* pyinstrument: a profiler for Python processes, useful for optimization
* torchsnooper: a debugging tool for PyTorch that logs tensor information such as shapes as the code runs, so no more `print(x.shape)`.
* knockknock: A single line of code that gets you a notification when your 10-hour model training is finally done. No more staring at the progress bar.
* colorama: Colored printing in the terminal \(cross-platform\)
* Hypothesis - Property-based testing, with autogenerated inputs for unit tests.
**Reviewing \(any suggestions for code metric report/analysis library are welcome!\)**
* [coala](https://github.com/coala/coala) - coala provides a unified command-line interface for linting and fixing all your code, regardless of the programming languages you use.
* [radon](https://github.com/rubik/radon) - Radon is a Python tool that computes various metrics from the source code
* great\_expectations - A data validation library for Python that integrates with Pandas/Spark/SQL
### Syntax Highlight
* lunr.js
A catalog of various machine learning topics.
* [Graph Neural Network Basics](./#graph-neural-network-basics)
* [Understand What is the weird D<sup>-1/2</sup>LD<sup>-1/2</sup>](./#understand-what-is-the-weird-dsup-12supldsup-12sup)
* [Supplement Chinese Reading](./#supplement-chinese-reading)
* [Time Series Forecast](./#time-series-forecast)
* [Motivation](./#motivation)
* [Forecasting Methods](./#forecasting-methods)
* [Statistical Method](./#statistical-method)
* [Machine Learning](./#machine-learning)
* [Deep Neural Network](./#deep-neural-network)
* [Prediction Interval](./#prediction-interval)
* [Python Time Series Forecasting Library](./#python-time-series-forecasting-library)
* [Contribution](./#contribution)
* [Under Review](./#under-review)
## Graph Neural Network Basics
#### Understand What is the weird D<sup>-1/2</sup>LD<sup>-1/2</sup>
1. [spectral graph theory - Why Laplacian Matrix need normalization and how come the sqrt of Degree Matrix? - Mathematics Stack Exchange](https://math.stackexchange.com/questions/1113467/why-laplacian-matrix-need-normalization-and-how-come-the-sqrt-of-degree-matrix)
2. [What's the intuition behind a Laplacian matrix? I'm not so much interested in mathematical details or technical applications. I'm trying to grasp what a laplacian matrix actually represents, and what aspects of a graph it makes accessible. - Quora](https://www.quora.com/Whats-the-intuition-behind-a-Laplacian-matrix-Im-not-so-much-interested-in-mathematical-details-or-technical-applications-Im-trying-to-grasp-what-a-laplacian-matrix-actually-represents-and-what-aspects-of-a-graph-it-makes-accessible)
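
To make the normalized Laplacian discussed in these links concrete, here is a minimal sketch that computes D^{-1/2} L D^{-1/2} with L = D - A for a small undirected graph. The function name and the toy path graph are my own illustration, not taken from the linked posts, and isolated nodes are simply left at zero by assumption.

```python
import math


def normalized_laplacian(adjacency):
    """Return D^{-1/2} L D^{-1/2} for an undirected graph, where L = D - A."""
    n = len(adjacency)
    # The degree of each node is the sum of its row in the adjacency matrix.
    degrees = [sum(row) for row in adjacency]
    result = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if degrees[i] == 0 or degrees[j] == 0:
                continue  # assumption: entries for isolated nodes stay at 0
            laplacian_ij = (degrees[i] if i == j else 0) - adjacency[i][j]
            # Scaling by 1/sqrt(d_i * d_j) is the D^{-1/2} ... D^{-1/2} step.
            result[i][j] = laplacian_ij / math.sqrt(degrees[i] * degrees[j])
    return result


# Toy path graph 0 - 1 - 2 (hypothetical example data).
path_graph = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
l_sym = normalized_laplacian(path_graph)
for row in l_sym:
    print([round(value, 3) for value in row])
```

After normalization every connected node has a 1 on the diagonal, and each edge contributes -1/sqrt(d_i d_j), so high-degree nodes no longer dominate the matrix.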
#### Supplement Chinese Reading
1. [Heat Diffusion](https://www.zhihu.com/question/54504471/answer/630639025)
2. [GCN use edge to agg node information](https://www.zhihu.com/question/54504471/answer/611222866)
3. [How to do batch training with GCN](https://zhuanlan.zhihu.com/p/55191463)
## Time Series Forecast
#### Motivation
While neural networks have had a lot of success in NLP and computer vision, traditional time series forecasting has changed relatively little. This repository aims to study the latest practical techniques for time series prediction, whether statistical methods, machine learning, or deep neural networks.
#### Forecasting Methods
**Statistical Method**
**Machine Learning**
**Deep Neural Network**
[Gramian Angular Field](https://forums.fast.ai/t/time-series-sequential-data-study-group/29686/2?u=nok): Transforms a time series into an image and uses transfer learning with a CNN
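
As a rough illustration of the "time series to image" step, the sketch below computes the summation variant of the Gramian Angular Field as it is commonly defined: rescale the series to [-1, 1], map each value to an angle with arccos, and fill the image with cos(angle_i + angle_j). The function name and sample series are hypothetical, and the CNN transfer-learning step is not shown.

```python
import math


def gramian_angular_summation_field(series):
    """Encode a 1-D series as a square matrix of cos(phi_i + phi_j) values."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Rescale to [-1, 1] so that each value is a valid cosine.
    scaled = [(2 * x - hi - lo) / (hi - lo) for x in series]
    # Clamp against floating-point drift before taking arccos.
    angles = [math.acos(max(-1.0, min(1.0, x))) for x in scaled]
    return [[math.cos(a + b) for b in angles] for a in angles]


# Hypothetical three-point series; a real series gives a len(series) x len(series) image.
image = gramian_angular_summation_field([0.0, 1.0, 2.0])
for row in image:
    print([round(value, 3) for value in row])
```

The resulting matrix is symmetric and can be treated as a single-channel image, which is what makes pretrained image models applicable.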
### Prediction Interval
While forecasting accuracy is important, the prediction interval matters too, and it is an area that the machine learning world has focused on less.
* Traditional statistical forecast \(ARIMA, ETS etc\)
* Bayesian Neural Network
* Random Forest jackknife approximation
* MCDropout \(Use Dropout at inference time as approximate variational inference\)
* Quantile Regression
* VOGN \(Optimizer weight perturbation\)
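
To give a feel for one item in the list above, quantile regression, here is a minimal sketch of the pinball (quantile) loss it minimizes. For simplicity it fits only a constant prediction to made-up data; a real quantile regression would train a model with the same loss. All names and numbers here are illustrative assumptions.

```python
def pinball_loss(y_true, prediction, quantile):
    """Mean pinball (quantile) loss of a single constant prediction."""
    total = 0.0
    for y in y_true:
        error = y - prediction
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += max(quantile * error, (quantile - 1) * error)
    return total / len(y_true)


def best_constant(y_true, quantile):
    """Pick the observed value that minimizes the pinball loss."""
    return min(y_true, key=lambda c: pinball_loss(y_true, c, quantile))


observations = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up data
lower = best_constant(observations, 0.1)
median = best_constant(observations, 0.5)
upper = best_constant(observations, 0.9)
print(lower, median, upper)
```

Fitting the 0.1 and 0.9 quantiles gives the two ends of an 80% prediction interval, while the 0.5 quantile gives the median forecast.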
#### Python Time Series Forecasting Library
[Prophet \(Facebook\)](https://github.com/facebook/prophet): Tool for producing high-quality forecasts for time series data that has multiple seasonalities with linear or non-linear growth. It has built-in modeling for holiday effects.
[pyts](https://johannfaouzi.github.io/pyts/): state-of-the-art algorithms for time series transformation and classification
### Contribution
Feel free to send a PR or discuss by starting an issue.😁
_powered by_ [_fastpages_](https://github.com/fastai/fastpages)
fastpages allows me to blog directly in a notebook, so I no longer have to worry about converting to Markdown. I simply code and write.