https://github.com/kimrass/kimrass

Host: GitHub
URL: https://github.com/kimrass/kimrass
Owner: KimRass
Created: 2021-11-30T05:12:30.000Z (over 3 years ago)
Default Branch: main
Last Pushed: 2024-10-24T17:02:53.000Z (8 months ago)
Last Synced: 2024-10-24T18:44:24.249Z (8 months ago)
Topics: business-intelligence, data-analysis, data-collection, image-inpainting, scene-text-detection, tableau, textual-attribute-recognition
Homepage:
Size: 1.12 GB
Stars: 3
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
README

# 1. Personal Projects

## 1) From-scratch PyTorch Implementations of AI papers

|Year|Paper|Details|
|:-:|:-|:-|
|Vision|
|2014|[VAE](https://github.com/KimRass/VAE) (Kingma and Welling)|[✓] Training on MNIST<br>[✓] Visualizing Encoder output<br>[✓] Visualizing Decoder output<br>[✓] Reconstructing image|
|2015|[CAM](https://github.com/KimRass/CAM) (Zhou et al.)|[✓] Applying GoogLeNet<br>[✓] Generating 'Class Activation Map'<br>[✓] Generating bounding box|
|2016|[Gatys et al.](https://github.com/KimRass/Gatys-et-al.-2016)|[✓] Experimenting on input image size<br>[✓] Experimenting on VGGNet-19 with Batch normalization<br>[✓] Applying VGGNet-19|
||[YOLO](https://github.com/KimRass/YOLO) (Redmon et al.)|[✓] Model architecture<br>[✓] Visualizing ground truth on grid<br>[✓] Visualizing model output<br>[✓] Visualizing class probability map<br>[ㅤ] Loss function<br>[ㅤ] Training on VOC 2012|
||[DCGAN](https://github.com/KimRass/DCGAN) (Radford et al.)|[✓] Training on CelebA at 64 × 64<br>[✓] Sampling<br>[✓] Interpolating in latent space<br>[ㅤ] Training on CelebA at 32 × 32|
||[Noroozi et al.](https://github.com/KimRass/Mehdi-Noroozi-et-al.-2016)|[✓] Model architecture<br>[✓] Chromatic aberration<br>[✓] Permutation set|
||[Zhang et al.](https://github.com/KimRass/Richard-Zhang-et-al.-2016)|[✓] Visualizing empirical probability distribution<br>[ㅤ] Model architecture<br>[ㅤ] Loss function<br>[ㅤ] Training|
|2014<br>2017|[Conditional GAN](https://github.com/KimRass/Conditional-WGAN-GP) (Mirza et al.)<br>[WGAN-GP](https://github.com/KimRass/Conditional-WGAN-GP) (Gulrajani et al.)|[✓] Training on MNIST|
|2016<br>2017|[VQ-VAE](https://github.com/KimRass/VQ-VAE-PixelCNN) (Oord et al.)<br>[PixelCNN](https://github.com/KimRass/VQ-VAE-PixelCNN) (Oord et al.)|[✓] Training on Fashion MNIST<br>[✓] Training on CIFAR-10<br>[✓] Sampling|
|2017|[Pix2Pix](https://github.com/KimRass/Pix2Pix) (Isola et al.)|[✓] Experimenting on image mean and std<br>[✓] Experimenting on `nn.InstanceNorm2d()`<br>[✓] Training on Google Maps<br>[✓] Training on Facades<br>[ㅤ] Higher-resolution input image|
||[CycleGAN](https://github.com/KimRass/CycleGAN) (Zhu et al.)|[✓] Experimenting on random image pairing<br>[✓] Experimenting on LSGANs<br>[✓] Training on monet2photo<br>[✓] Training on vangogh2photo<br>[✓] Training on cezanne2photo<br>[✓] Training on ukiyoe2photo<br>[✓] Training on horse2zebra<br>[✓] Training on summer2winter_yosemite|
|2018|[PGGAN](https://github.com/KimRass/PGGAN) (Karras et al.)|[✓] Experimenting on image mean and std<br>[✓] Training on CelebA-HQ at 512 × 512<br>[✓] Sampling|
||[DeepLabv3](https://github.com/KimRass/DeepLabv3) (Chen et al.)|[✓] Training on VOC 2012<br>[✓] Predicting on VOC 2012 validation set<br>[✓] Average mIoU<br>[✓] Visualizing model output|
||[RotNet](https://github.com/KimRass/RotNet) (Gidaris et al.)|[✓] Visualizing Attention map|
||[StarGAN](https://github.com/KimRass/StarGAN) (Yunjey Choi et al.)|[✓] Model architecture|
|2020|[STEFANN](https://github.com/KimRass/STEFANN) (Roy et al.)|[✓] FANnet architecture<br>[✓] Colornet architecture<br>[✓] Training FANnet on Google Fonts<br>[✓] Custom Google Fonts dataset<br>[✓] Average SSIM<br>[ㅤ] Training Colornet|
||[DDPM](https://github.com/KimRass/DDPM) (Ho et al.)|[✓] Training on CelebA at 32 × 32<br>[✓] Training on CelebA at 64 × 64<br>[✓] Visualizing denoising process<br>[✓] Sampling using linear interpolation<br>[✓] Sampling using coarse-to-fine interpolation|
||[DDIM](https://github.com/KimRass/DDIM) (Song et al.)|[✓] Normal sampling<br>[✓] Sampling using spherical linear interpolation<br>[✓] Sampling using grid interpolation<br>[✓] Truncated normal|
||[ViT](https://github.com/KimRass/ViT) (Dosovitskiy et al.)|[✓] Training on CIFAR-10<br>[✓] Training on CIFAR-100<br>[✓] Visualizing Attention map using Attention Roll-out<br>[✓] Visualizing position embedding similarity<br>[✓] Interpolating position embedding<br>[✓] CutOut<br>[✓] CutMix<br>[✓] Hide-and-Seek|
||[SimCLR](https://github.com/KimRass/SimCLR) (Chen et al.)|[✓] Normalized temperature-scaled cross entropy loss<br>[✓] Data augmentation<br>[✓] Pixel intensity histogram|
||[DETR](https://github.com/KimRass/DETR) (Carion et al.)|[✓] Model architecture<br>[ㅤ] Bipartite matching & loss<br>[ㅤ] Batch normalization freezing<br>[ㅤ] Training on COCO 2017|
|2021|[Improved DDPM](https://github.com/KimRass/Improved-DDPM) (Nichol and Dhariwal)|[✓] Cosine diffusion schedule|
||[Classifier-Guidance](https://github.com/KimRass/Classifier-Guidance) (Dhariwal and Nichol)|[✓] Training on CIFAR-10<br>[ㅤ] AdaGN<br>[ㅤ] BigGAN Upsample/Downsample<br>[ㅤ] Improved DDPM sampling<br>[ㅤ] Conditional/Unconditional models<br>[ㅤ] Super-resolution model<br>[ㅤ] Interpolation|
||[ILVR](https://github.com/KimRass/ILVR) (Choi et al.)|[✓] Sampling using single reference<br>[✓] Sampling using various downsampling factors<br>[✓] Sampling using various conditioning range|
||[SDEdit](https://github.com/KimRass/SDEdit) (Meng et al.)|[✓] User input stroke simulation<br>[✓] Applying CelebA at 64 × 64<br>[ㅤ] Total repeats<br>[ㅤ] VE SDEdit<br>[ㅤ] Sampling from scribble<br>[ㅤ] Image editing only on masked regions|
||[MAE](https://github.com/KimRass/MAE) (He et al.)|[✓] Model architecture for self-supervised pre-training<br>[✓] Model architecture for classification<br>[ㅤ] Self-supervised pre-training on ImageNet-1K<br>[ㅤ] Fine-tuning on ImageNet-1K<br>[ㅤ] Linear probing|
||[Copy-Paste](https://github.com/KimRass/Copy-Paste) (Ghiasi et al.)|[✓] COCO dataset processing<br>[✓] Large scale jittering<br>[✓] Copy-Paste (within mini-batch)<br>[✓] Visualizing data<br>[ㅤ] Gaussian filter|
||[ViViT](https://github.com/KimRass/ViViT) (Arnab et al.)|[✓] 'Spatio-temporal attention' architecture<br>[✓] 'Factorised encoder' architecture<br>[✓] 'Factorised self-attention' architecture|
|2022|[CFG](https://github.com/KimRass/CFG) (Ho et al.)||
|Language|
|2017|[Transformer](https://github.com/KimRass/Transformer) (Vaswani et al.)|[✓] Model architecture<br>[✓] Visualizing position encoding|
|2019|[BERT](https://github.com/KimRass/BERT) (Devlin et al.)|[✓] Model architecture<br>[✓] Masked language modeling<br>[✓] BookCorpus data processing<br>[✓] SQuAD data processing<br>[✓] SWAG data processing|
||[Sentence-BERT](https://github.com/KimRass/Sentence-BERT) (Reimers et al.)|[✓] Classification loss<br>[✓] Regression loss<br>[✓] Contrastive loss<br>[✓] STSb data processing<br>[✓] WikiSection data processing<br>[ㅤ] NLI data processing|
||[RoBERTa](https://github.com/KimRass/RoBERTa) (Liu et al.)|[✓] BookCorpus data processing<br>[✓] Masked language modeling<br>[ㅤ] BookCorpus data processing ('SEGMENT-PAIR' + NSP)<br>[ㅤ] BookCorpus data processing ('SENTENCE-PAIR' + NSP)<br>[✓] BookCorpus data processing ('FULL-SENTENCES')<br>[ㅤ] BookCorpus data processing ('DOC-SENTENCES')|
|2021|[Swin Transformer](https://github.com/KimRass/Swin-Transformer) (Liu et al.)|[✓] Patch partition<br>[✓] Patch merging<br>[✓] Relative position bias<br>[✓] Feature map padding<br>[✓] Self-attention in non-overlapped windows<br>[ㅤ] Shifted Window based Self-Attention|
|2024|[RoPE](https://github.com/KimRass/RoPE) (Su et al.)|[✓] Rotary Positional Embedding|
|Vision-Language|
|2021|[CLIP](https://github.com/KimRass/CLIP) (Radford et al.)|[✓] Training on Flickr8k + Flickr30k<br>[✓] Zero-shot classification on ImageNet1k (mini)<br>[✓] Linear classification on ImageNet1k (mini)|
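
One item from the table above, the 'Cosine diffusion schedule' listed for Improved DDPM (Nichol and Dhariwal), can be illustrated with a minimal sketch. This is not code from the linked repository: the function names `cosine_alpha_bar` and `cosine_betas` are hypothetical, and the offset `s = 0.008` and the clipping value `0.999` are assumed here to match the values reported in the paper. It uses only the standard library rather than PyTorch so that it runs anywhere.

```python
import math


def cosine_alpha_bar(t, num_steps, s=0.008):
    """Cumulative signal level alpha_bar(t) = f(t) / f(0) (hypothetical helper)."""
    def f(u):
        # f(u) = cos^2(((u / T) + s) / (1 + s) * pi / 2)
        return math.cos((u / num_steps + s) / (1 + s) * math.pi / 2) ** 2

    return f(t) / f(0)


def cosine_betas(num_steps, max_beta=0.999):
    """Per-step noise levels beta_t = 1 - alpha_bar(t) / alpha_bar(t - 1)."""
    betas = []
    for t in range(1, num_steps + 1):
        ratio = cosine_alpha_bar(t, num_steps) / cosine_alpha_bar(t - 1, num_steps)
        # Clip to avoid a singularity at the end of the diffusion process.
        betas.append(min(1 - ratio, max_beta))
    return betas


betas = cosine_betas(1000)
```

With 1000 steps, the resulting `betas` start very small and grow toward the clipped maximum at the final step, which is the behaviour the cosine schedule is designed to produce.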

## 2) [Fine-tuning 'EasyOCR' on the '공공행정문서 OCR' (Public Administrative Document OCR) Dataset Provided by 'AI-Hub'](https://github.com/KimRass/train_easyocr)

## 3) [Recognizing Book Content Using the 'CLOVA OCR API'](https://github.com/KimRass/book_text_recognizer)

## 4) [A Rule-based Algorithm for Solving Edge-matching Puzzles of Arbitrary Sizes Using L2 Distance](https://github.com/KimRass/Jigsaw-Puzzle)
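
As a rough illustration of the idea named in the heading above, the sketch below scores how well two puzzle pieces fit by taking the L2 distance between their facing edges. It is not the repository's algorithm: representing pieces as 2D lists of grayscale values, and every function name here (`l2_distance`, `right_edge`, `left_edge`, `best_right_neighbor`), are assumptions made for the example.

```python
import math


def l2_distance(edge_a, edge_b):
    """Euclidean (L2) distance between two equally long edges."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(edge_a, edge_b)))


def right_edge(piece):
    """Last pixel of every row."""
    return [row[-1] for row in piece]


def left_edge(piece):
    """First pixel of every row."""
    return [row[0] for row in piece]


def best_right_neighbor(piece, candidates):
    """Index of the candidate whose left edge is closest to the piece's right edge."""
    target = right_edge(piece)
    distances = [l2_distance(target, left_edge(c)) for c in candidates]
    return min(range(len(candidates)), key=distances.__getitem__)


# Toy 3 x 3 grayscale pieces (made-up values).
anchor = [[10, 20, 30], [10, 20, 40], [10, 20, 50]]
candidates = [
    [[200, 0, 0], [210, 0, 0], [220, 0, 0]],
    [[31, 5, 5], [39, 5, 5], [52, 5, 5]],
]
best = best_right_neighbor(anchor, candidates)
```

Here the second candidate is selected, because its left edge `[31, 39, 52]` is much closer to the anchor's right edge `[30, 40, 50]` than the first candidate's. A full solver would repeat this comparison for all four sides and all pieces.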

## 5) [A 'FastAPI'-based API for Performing Semantic Segmentation Using a 'DeepLabv3' Model Pretrained on the 'VOC2012' Dataset](https://github.com/KimRass/FastAPI)