https://github.com/kimrass/kimrass


# 1. Personal Projects

## 1) From-scratch PyTorch Implementations of AI papers
|Year|Paper|Details|
|:-:|:-|:-|
|Vision|
|2014|[VAE](https://github.com/KimRass/VAE) (Kingma and Welling)|[✓] Training on MNIST
[✓] Visualizing Encoder output
[✓] Visualizing Decoder output
[✓] Reconstructing image|
|2015|[CAM](https://github.com/KimRass/CAM) (Zhou et al.)|[✓] Applying GoogLeNet
[✓] Generating 'Class Activation Map'
[✓] Generating bounding box|
|2016|[Gatys et al.](https://github.com/KimRass/Gatys-et-al.-2016)|[✓] Experimenting on input image size
[✓] Experimenting on VGGNet-19 with Batch normalization
[✓] Applying VGGNet-19|
||[YOLO](https://github.com/KimRass/YOLO) (Redmon et al.)|[✓] Model architecture
[✓] Visualizing ground truth on grid
[✓] Visualizing model output
[✓] Visualizing class probability map
[ㅤ] Loss function
[ㅤ] Training on VOC 2012|
||[DCGAN](https://github.com/KimRass/DCGAN) (Radford et al.)|[✓] Training on CelebA at 64 × 64
[✓] Sampling
[✓] Interpolating in latent space
[ㅤ] Training on CelebA at 32 × 32|
||[Noroozi et al.](https://github.com/KimRass/Mehdi-Noroozi-et-al.-2016)|[✓] Model architecture
[✓] Chromatic aberration
[✓] Permutation set|
||[Zhang et al.](https://github.com/KimRass/Richard-Zhang-et-al.-2016)|[✓] Visualizing empirical probability distribution
[ㅤ] Model architecture
[ㅤ] Loss function
[ㅤ] Training|
|2014
2017|[Conditional GAN](https://github.com/KimRass/Conditional-WGAN-GP) (Mirza et al.)
[WGAN-GP](https://github.com/KimRass/Conditional-WGAN-GP) (Gulrajani et al.)|[✓] Training on MNIST|
|2016
2017|[VQ-VAE](https://github.com/KimRass/VQ-VAE-PixelCNN) (Oord et al.)
[PixelCNN](https://github.com/KimRass/VQ-VAE-PixelCNN) (Oord et al.)|[✓] Training on Fashion MNIST
[✓] Training on CIFAR-10
[✓] Sampling|
|2017|[Pix2Pix](https://github.com/KimRass/Pix2Pix) (Isola et al.)|[✓] Experimenting on image mean and std
[✓] Experimenting on `nn.InstanceNorm2d()`
[✓] Training on Google Maps
[✓] Training on Facades
[ㅤ] Higher-resolution input image|
||[CycleGAN](https://github.com/KimRass/CycleGAN) (Zhu et al.)|[✓] Experimenting on random image pairing
[✓] Experimenting on LSGANs
[✓] Training on monet2photo
[✓] Training on vangogh2photo
[✓] Training on cezanne2photo
[✓] Training on ukiyoe2photo
[✓] Training on horse2zebra
[✓] Training on summer2winter_yosemite|
|2018|[PGGAN](https://github.com/KimRass/PGGAN) (Karras et al.)|[✓] Experimenting on image mean and std
[✓] Training on CelebA-HQ at 512 × 512
[✓] Sampling|
||[DeepLabv3](https://github.com/KimRass/DeepLabv3) (Chen et al.)|[✓] Training on VOC 2012
[✓] Predicting on VOC 2012 validation set
[✓] Average mIoU
[✓] Visualizing model output|
||[RotNet](https://github.com/KimRass/RotNet) (Gidaris et al.)|[✓] Visualizing Attention map|
||[StarGAN](https://github.com/KimRass/StarGAN) (Yunjey Choi et al.)|[✓] Model architecture|
|2020|[STEFANN](https://github.com/KimRass/STEFANN) (Roy et al.)|[✓] FANnet architecture
[✓] Colornet architecture
[✓] Training FANnet on Google Fonts
[✓] Custom Google Fonts dataset
[✓] Average SSIM
[ㅤ] Training Colornet|
||[DDPM](https://github.com/KimRass/DDPM) (Ho et al.)|[✓] Training on CelebA at 32 × 32
[✓] Training on CelebA at 64 × 64
[✓] Visualizing denoising process
[✓] Sampling using linear interpolation
[✓] Sampling using coarse-to-fine interpolation|
||[DDIM](https://github.com/KimRass/DDIM) (Song et al.)|[✓] Normal sampling
[✓] Sampling using spherical linear interpolation
[✓] Sampling using grid interpolation
[✓] Truncated normal|
||[ViT](https://github.com/KimRass/ViT) (Dosovitskiy et al.)|[✓] Training on CIFAR-10
[✓] Training on CIFAR-100
[✓] Visualizing Attention map using Attention Roll-out
[✓] Visualizing position embedding similarity
[✓] Interpolating position embedding
[✓] CutOut
[✓] CutMix
[✓] Hide-and-Seek|
||[SimCLR](https://github.com/KimRass/SimCLR) (Chen et al.)|[✓] Normalized temperature-scaled cross entropy loss
[✓] Data augmentation
[✓] Pixel intensity histogram|
||[DETR](https://github.com/KimRass/DETR) (Carion et al.)|[✓] Model architecture
[ㅤ] Bipartite matching & loss
[ㅤ] Batch normalization freezing
[ㅤ] Training on COCO 2017|
|2021|[Improved DDPM](https://github.com/KimRass/Improved-DDPM) (Nichol and Dhariwal)|[✓] Cosine diffusion schedule|
||[Classifier-Guidance](https://github.com/KimRass/Classifier-Guidance) (Dhariwal and Nichol)|[✓] Training on CIFAR-10
[ㅤ] AdaGN
[ㅤ] BigGAN Upsample/Downsample
[ㅤ] Improved DDPM sampling
[ㅤ] Conditional/Unconditional models
[ㅤ] Super-resolution model
[ㅤ] Interpolation|
||[ILVR](https://github.com/KimRass/ILVR) (Choi et al.)|[✓] Sampling using single reference
[✓] Sampling using various downsampling factors
[✓] Sampling using various conditioning range|
||[SDEdit](https://github.com/KimRass/SDEdit) (Meng et al.)|[✓] User input stroke simulation
[✓] Applying CelebA at 64 × 64
[ㅤ] Total repeats
[ㅤ] VE SDEdit
[ㅤ] Sampling from scribble
[ㅤ] Image editing only on masked regions|
||[MAE](https://github.com/KimRass/MAE) (He et al.)|[✓] Model architecture for self-supervised pre-training
[✓] Model architecture for classification
[ㅤ] Self-supervised pre-training on ImageNet-1K
[ㅤ] Fine-tuning on ImageNet-1K
[ㅤ] Linear probing|
||[Copy-Paste](https://github.com/KimRass/Copy-Paste) (Ghiasi et al.)|[✓] COCO dataset processing
[✓] Large scale jittering
[✓] Copy-Paste (within mini-batch)
[✓] Visualizing data
[ㅤ] Gaussian filter|
||[ViViT](https://github.com/KimRass/ViViT) (Arnab et al.)|[✓] 'Spatio-temporal attention' architecture
[✓] 'Factorised encoder' architecture
[✓] 'Factorised self-attention' architecture|
|2022|[CFG](https://github.com/KimRass/CFG) (Ho et al.)|
|Language|
|2017|[Transformer](https://github.com/KimRass/Transformer) (Vaswani et al.)|[✓] Model architecture
[✓] Visualizing position encoding|
|2019|[BERT](https://github.com/KimRass/BERT) (Devlin et al.)|[✓] Model architecture
[✓] Masked language modeling
[✓] BookCorpus data processing
[✓] SQuAD data processing
[✓] SWAG data processing|
||[Sentence-BERT](https://github.com/KimRass/Sentence-BERT) (Reimers et al.)|[✓] Classification loss
[✓] Regression loss
[✓] Contrastive loss
[✓] STSb data processing
[✓] WikiSection data processing
[ㅤ] NLI data processing|
||[RoBERTa](https://github.com/KimRass/RoBERTa) (Liu et al.)|[✓] BookCorpus data processing
[✓] Masked language modeling
[ㅤ] BookCorpus data processing ('SEGMENT-PAIR' + NSP)
[ㅤ] BookCorpus data processing ('SENTENCE-PAIR' + NSP)
[✓] BookCorpus data processing ('FULL-SENTENCES')
[ㅤ] BookCorpus data processing ('DOC-SENTENCES')|
|2021|[Swin Transformer](https://github.com/KimRass/Swin-Transformer) (Liu et al.)|[✓] Patch partition
[✓] Patch merging
[✓] Relative position bias
[✓] Feature map padding
[✓] Self-attention in non-overlapped windows
[ㅤ] Shifted Window based Self-Attention|
|2024|[RoPE](https://github.com/KimRass/RoPE) (Su et al.)|[✓] Rotary Positional Embedding|
|Vision-Language|
|2021|[CLIP](https://github.com/KimRass/CLIP) (Radford et al.)|[✓] Training on Flickr8k + Flickr30k
[✓] Zero-shot classification on ImageNet1k (mini)
[✓] Linear classification on ImageNet1k (mini)|

## 2) [Fine-tuning 'EasyOCR' on the '공공행정문서 OCR' (Public Administrative Document OCR) Dataset Provided by 'AI-Hub'](https://github.com/KimRass/train_easyocr)

## 3) [Recognizing Book Content Using the 'CLOVA OCR API'](https://github.com/KimRass/book_text_recognizer)

## 4) [A Rule-based Algorithm for Solving Edge-matching Puzzles of Arbitrary Sizes Using L2 Distance](https://github.com/KimRass/Jigsaw-Puzzle)
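
The idea named in the title, matching puzzle pieces by the L2 distance between their edges, can be sketched roughly as below. This is only an illustrative sketch and not the repository's actual code: it assumes square grayscale tiles stored as nested lists, and the function names (`edge_l2_distance`, `best_right_neighbor`, and the edge helpers) are hypothetical.

```python
import math


def edge_l2_distance(edge_a, edge_b):
    """Euclidean (L2) distance between two edges of equal length."""
    if len(edge_a) != len(edge_b):
        raise ValueError("edges must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(edge_a, edge_b)))


def right_edge(tile):
    """Last pixel of every row (hypothetical helper)."""
    return [row[-1] for row in tile]


def left_edge(tile):
    """First pixel of every row (hypothetical helper)."""
    return [row[0] for row in tile]


def best_right_neighbor(tile, candidates):
    """Index of the candidate whose left edge is closest to the tile's right edge."""
    target = right_edge(tile)
    distances = [edge_l2_distance(target, left_edge(c)) for c in candidates]
    return min(range(len(candidates)), key=distances.__getitem__)


# Toy example: the second candidate continues the tile's right edge almost exactly.
tile = [[0, 10], [0, 20]]
candidates = [
    [[90, 0], [80, 0]],
    [[11, 0], [19, 0]],
]
print(best_right_neighbor(tile, candidates))
```

A greedy rule like this picks one neighbor at a time. A full solver would also need to compare the other three edges and resolve conflicts between pieces, which this sketch leaves out.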

## 5) [A 'FastAPI'-based API for Performing Semantic Segmentation Using a 'DeepLabv3' Pretrained on the 'VOC2012' Dataset](https://github.com/KimRass/FastAPI)