{"id":21538934,"url":"https://github.com/yassouali/ml-paper-notes","last_synced_at":"2026-02-17T21:01:12.873Z","repository":{"id":46168306,"uuid":"194685842","full_name":"yassouali/ML-paper-notes","owner":"yassouali","description":":notebook: Notes and summaries of various ML, Computer Vision \u0026 NLP papers.","archived":false,"fork":false,"pushed_at":"2022-05-02T16:40:21.000Z","size":96484,"stargazers_count":565,"open_issues_count":0,"forks_count":79,"subscribers_count":39,"default_branch":"master","last_synced_at":"2026-01-23T12:47:16.128Z","etag":null,"topics":["computer-vision","deep-learning","machine-learning","natural-language-processing","nlp","summary"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yassouali.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-07-01T14:12:57.000Z","updated_at":"2026-01-09T14:48:05.000Z","dependencies_parsed_at":"2022-09-06T07:31:27.420Z","dependency_job_id":null,"html_url":"https://github.com/yassouali/ML-paper-notes","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/yassouali/ML-paper-notes","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yassouali%2FML-paper-notes","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yassouali%2FML-paper-notes/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yassouali%2FML-paper-notes/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yassouali%2FML-paper-notes/manifests","owner_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/owners/yassouali","download_url":"https://codeload.github.com/yassouali/ML-paper-notes/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yassouali%2FML-paper-notes/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29558100,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-17T20:52:40.164Z","status":"ssl_error","status_checked_at":"2026-02-17T20:48:10.325Z","response_time":100,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","deep-learning","machine-learning","natural-language-processing","nlp","summary"],"created_at":"2024-11-24T04:13:45.890Z","updated_at":"2026-02-17T21:01:12.846Z","avatar_url":"https://github.com/yassouali.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"\n\n# ML Papers\nThis repo contains notes and short summaries of ML-related papers I come across, organized by subject. The summaries are provided as PDFs.\n\n## Self-Supervised \u0026 Contrastive Learning\n\n- Self-Supervised Relational Reasoning for Representation Learning (2020): [[Paper]](https://arxiv.org/abs/2006.05849) [[Notes]](notes/103_SSL_relation_reasoning.pdf)\n- Big Self-Supervised Models are Strong Semi-Supervised Learners (2020): [[Paper]](https://arxiv.org/abs/2006.10029) [[Notes]](notes/95_big_self-supervised_models.pdf)\n- 
Debiased Contrastive Learning (2020) [[Paper]](https://arxiv.org/abs/2007.00224) [[Notes]](notes/97_debiased_contrastive_learning.pdf)\n- Selfie: Self-supervised Pretraining for Image Embedding (2019): [[Paper]](https://arxiv.org/abs/1906.02940) [[Notes]](notes/76_selfie_pretraining_for_img_embeddings.pdf)\n- Self-Supervised Representation Learning by Rotation Feature Decoupling (2019): [[Paper]](http://openaccess.thecvf.com/content_CVPR_2019/papers/Feng_Self-Supervised_Representation_Learning_by_Rotation_Feature_Decoupling_CVPR_2019_paper.pdf) [[Notes]](notes/73_SSL_by_rotation_decoupling.pdf)\n- Revisiting Self-Supervised Visual Representation Learning (2019): [[Paper]](https://arxiv.org/abs/1901.09005) [[Notes]](notes/72_revisiting_SSL.pdf)\n- AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transformations (2019): [[Paper]](https://arxiv.org/abs/1901.04596) [[Notes]](notes/74_AFT_vs_AED.pdf)\n- Boosting Self-Supervised Learning via Knowledge Transfer (2018): [[Paper]](https://arxiv.org/abs/1805.00385) [[Notes]](notes/67_boosting_self_super_via_trsf_learning.pdf)\n- Self-Supervised Feature Learning by Learning to Spot Artifacts (2018): [[Paper]](https://arxiv.org/abs/1806.05024) [[Notes]](notes/69_SSL_by_learn_to_spot_artifacts.pdf)\n- Unsupervised Representation Learning by Predicting Image Rotations (2018): [[Paper]](https://arxiv.org/abs/1803.07728) [[Notes]](notes/68_unsup_img_rep_learn_by_rot_predic.pdf)\n- Cross Pixel Optical-Flow Similarity for Self-Supervised Learning (2018): [[Paper]](https://arxiv.org/abs/1807.05636) [[Notes]](notes/75_cross_pixel_optical_flow.pdf)\n- Multi-task Self-Supervised Visual Learning (2017): [[Paper]](https://arxiv.org/abs/1708.07860) [[Notes]](notes/64_multi_task_self_supervised.pdf)\n- Split-Brain Autoencoders: Unsupervised Learning by Cross-Channel Prediction (2017): [[Paper]](https://arxiv.org/abs/1611.09842) [[Notes]](notes/65_split_brain_autoencoders.pdf)\n- Colorization as a Proxy Task for Visual 
Understanding (2017): [[Paper]](https://arxiv.org/abs/1703.04044) [[Notes]](notes/66_colorization_as_a_proxy_for_viz_under.pdf)\n- Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles (2017): [[Paper]](https://arxiv.org/abs/1603.09246) [[Notes]](notes/63_solving_jigsaw_puzzles.pdf)\n- Unsupervised Visual Representation Learning by Context Prediction (2016): [[Paper]](https://arxiv.org/abs/1505.05192) [[Notes]](notes/62_unsupervised_learning_with_context_prediction.pdf)\n- Colorful image colorization (2016): [[Paper]](https://richzhang.github.io/colorization/) [[Notes]](notes/59_colorful_colorization.pdf)\n- Learning visual groups from co-occurrences in space and time (2015): [[Paper]](https://arxiv.org/abs/1511.06811) [[Notes]](notes/61_visual_groups_from_co_occurrences.pdf)\n- Discriminative unsupervised feature learning with exemplar convolutional neural networks (2015): [[Paper]](https://arxiv.org/abs/1406.6909) [[Notes]](notes/60_exemplar_CNNs.pdf)\n\n## Semi-Supervised Learning\n\n- Negative sampling in semi-supervised learning (2020): [[Paper]](https://arxiv.org/abs/1911.05166) [[Notes]](notes/94_nagative_sampling_SSL.pdf)\n- Time-Consistent Self-Supervision for Semi-Supervised Learning (2020): [[Paper]](https://proceedings.icml.cc/static/paper_files/icml/2020/5578-Paper.pdf) [[Notes]](notes/93_time_consistent_SSL.pdf)\n- Dual Student: Breaking the Limits of the Teacher in Semi-supervised Learning (2019): [[Paper]](https://arxiv.org/abs/1909.01804) [[Notes]](notes/79_dual_student.pdf)\n- S4L: Self-Supervised Semi-Supervised Learning (2019): [[Paper]](https://arxiv.org/abs/1905.03670) [[Notes]](notes/83_S4L.pdf)\n- Semi-Supervised Learning by Augmented Distribution Alignment (2019): [[Paper]](https://arxiv.org/abs/1905.08171) [[Notes]](notes/80_SSL_aug_dist_align.pdf)\n- MixMatch: A Holistic Approach to Semi-Supervised Learning (2019): [[Paper]](https://arxiv.org/abs/1905.02249) [[Notes]](notes/45_mixmatch.pdf)\n- Unsupervised Data 
Augmentation (2019): [[Paper]](https://arxiv.org/abs/1904.12848) [[Notes]](notes/39_unsupervised_data_aug.pdf)\n- Interpolation Consistency Training for Semi-Supervised Learning (2019): [[Paper]](https://arxiv.org/abs/1903.03825) [[Notes]](notes/44_interpolation_consistency_tranining.pdf)\n- Deep Co-Training for Semi-Supervised Image Recognition (2018): [[Paper]](https://arxiv.org/abs/1803.05984) [[Notes]](notes/46_deep_co_training_img_rec.pdf)\n- Unifying semi-supervised and robust learning by mixup (2019): [[Paper]](https://openreview.net/forum?id=r1gp1jRN_4) [[Notes]](notes/42_mixmixup.pdf)\n- Realistic Evaluation of Deep Semi-Supervised Learning Algorithms (2018): [[Paper]](https://arxiv.org/abs/1804.09170) [[Notes]](notes/37_realistic_eval_of_deep_ss.pdf)\n- Semi-Supervised Sequence Modeling with Cross-View Training (2018): [[Paper]](https://arxiv.org/abs/1809.08370) [[Notes]](notes/38_cross_view_semi_supervised.pdf)\n- Virtual Adversarial Training (2017): [[Paper]](https://arxiv.org/abs/1704.03976) [[Notes]](notes/40_virtual_adversarial_training.pdf)\n- Mean teachers are better role models (2017): [[Paper]](https://arxiv.org/abs/1703.01780) [[Notes]](notes/56_mean_teachers.pdf)\n- Temporal Ensembling for Semi-Supervised Learning (2017): [[Paper]](https://arxiv.org/abs/1610.02242) [[Notes]](notes/55_temporal-ensambling.pdf)\n- Semi-Supervised Learning with Ladder Networks (2015): [[Paper]](https://arxiv.org/abs/1507.02672) [[Notes]](notes/33_ladder_nets.pdf)\n\n## Video Understanding\n\n\n\n- Multiscale Vision Transformers (2021): [[Paper]](https://arxiv.org/abs/2104.11227) [[Notes]](notes/Multiscale_Vision_Transformers.pdf)\n- ViViT A Video Vision Transformer (2021): [[Paper]](https://arxiv.org/abs/2103.15691) [[Notes]](notes/ViViT_A_Video_Vision_Transformer.pdf)\n- Space-time Mixing Attention for Video Transformer (2021): [[Paper]](https://arxiv.org/abs/2106.05968) [[Notes]](notes/Space-time_Mixing_Attention_for_Video_Transformer.pdf)\n- Is Space-Time 
Attention All You Need for Video Understanding (2021): [[Paper]](https://arxiv.org/abs/2102.05095) [[Notes]](notes/Is_Space-Time_Attention_All_You_Need_for_Video_Understanding.pdf)\n- An Image is Worth 16x16 Words What is a Video Worth (2021): [[Paper]](https://arxiv.org/abs/2103.13915) [[Notes]](notes/An_Image_is_Worth_16x16_Words_What_is_a_Video_Worth.pdf)\n- Temporal Query Networks for Fine-grained Video Understanding (2021): [[Paper]](https://arxiv.org/abs/2104.09496) [[Notes]](notes/Temporal_Query_Networks_for_Fine-grained_Video_Understanding.pdf)\n- X3D Expanding Architectures for Efficient Video Recognition (2020): [[Paper]](https://arxiv.org/abs/2004.04730) [[Notes]](notes/X3D_Expanding_Architectures_for_Efficient_Video_Recognition.pdf)\n- Temporal Pyramid Network for Action Recognition (2020): [[Paper]](https://arxiv.org/abs/2004.03548) [[Notes]](notes/Temporal_Pyramid_Network_for_Action_Recognition.pdf)\n- STM SpatioTemporal and Motion Encoding for Action Recognition (2019): [[Paper]](https://arxiv.org/abs/1908.02486) [[Notes]](notes/STM_SpatioTemporal_and_Motion_Encoding_for_Action_Recognition.pdf)\n- Video Classification with Channel-Separated Convolutional Networks (2019): [[Paper]](https://arxiv.org/abs/1904.02811) [[Notes]](notes/Video_Classification_with_Channel-Separated_Convolutional_Networks.pdf)\n- Video Modeling with Correlation Networks (2019): [[Paper]](https://arxiv.org/abs/1906.03349) [[Notes]](notes/Video_Modeling_with_Correlation_Networks.pdf)\n- Videos as Space-Time Region Graphs (2018): [[Paper]](https://arxiv.org/abs/1806.01810) [[Notes]](notes/Videos_as_Space-Time_Region_Graphs.pdf)\n- SlowFast Networks for Video Recognition (2018): [[Paper]](https://arxiv.org/abs/1812.03982) [[Notes]](notes/SlowFast_Networks_for_Video_Recognition.pdf)\n- TSM Temporal Shift Module for Efficient Video Understanding (2018): [[Paper]](https://arxiv.org/abs/1811.08383) [[Notes]](notes/TSM_Temporal_Shift_Module_for_Efficient_Video_Understanding.pdf)\n- 
Timeception for Complex Action Recognition (2018): [[Paper]](https://arxiv.org/abs/1812.01289) [[Notes]](notes/Timeception_for_Complex_Action_Recognition.pdf)\n- Non-local Neural Networks (2017): [[Paper]](https://arxiv.org/abs/1711.07971) [[Notes]](notes/Non-local_Neural_Networks.pdf)\n- Temporal Segment Networks for Action Recognition in Videos. (2017): [[Paper]](https://arxiv.org/abs/1705.02953) [[Notes]](notes/Temporal_Segment_Networks_for_Action_Recognition_in_Videos..pdf)\n- Quo Vadis Action Recognition A New Model and the Kinetics Dataset (2017): [[Paper]](https://arxiv.org/abs/1705.07750) [[Notes]](notes/Quo_Vadis_Action_Recognition_A_New_Model_and_the_Kinetics_Dataset.pdf)\n- A Closer Look at Spatiotemporal Convolutions for Action Recognition (2017): [[Paper]](https://arxiv.org/abs/1711.11248) [[Notes]](notes/A_Closer_Look_at_Spatiotemporal_Convolutions_for_Action_Recognition.pdf)\n- ActionVLAD Learning spatio-temporal aggregation for action classification (2017): [[Paper]](https://arxiv.org/abs/1704.02895) [[Notes]](notes/ActionVLAD_Learning_spatio-temporal_aggregation_for_action_classification.pdf)\n- Spatiotemporal Residual Networks for Video Action Recognition (2016): [[Paper]](https://arxiv.org/abs/1611.02155) [[Notes]](notes/Spatiotemporal_Residual_Networks_for_Video_Action_Recognition.pdf)\n- Deep Temporal Linear Encoding Networks (2016): [[Paper]](https://arxiv.org/abs/1611.06678) [[Notes]](notes/Deep_Temporal_Linear_Encoding_Networks.pdf)\n- Temporal Convolutional Networks for Action Segmentation and Detection (2016): [[Paper]](https://arxiv.org/abs/1611.05267) [[Notes]](notes/Temporal_Convolutional_Networks_for_Action_Segmentation_and_Detection.pdf)\n- Learning Spatiotemporal Features with 3D Convolutional Network (2014): [[Paper]](https://arxiv.org/abs/1412.0767) [[Notes]](notes/Learning_Spatiotemporal_Features_with_3D_Convolutional_Network.pdf)\n\n\n## Domain Adaptation, Domain \u0026 Out-of-Distribution Generalization\n\n- Rethinking 
Distributional Matching Based Domain Adaptation (2020): [[Paper]](https://arxiv.org/abs/2006.13352) [[Notes]](notes/98_rethinking_distributional_matching.pdf)\n- Transferability vs. Discriminability: Batch Spectral Penalization (2019): [[Paper]](http://proceedings.mlr.press/v97/chen19i.html) [[Notes]](notes/91_batch_spectral_normalization.pdf)\n- On Learning Invariant Representations for Domain Adaptation (2019): [[Paper]](https://arxiv.org/abs/1901.09453) [[Notes]](notes/90_on_learning_invariant_repr.pdf)\n- Universal Domain Adaptation (2019): [[Paper]](http://openaccess.thecvf.com/content_CVPR_2019/papers/You_Universal_Domain_Adaptation_CVPR_2019_paper.pdf) [[Notes]](notes/87_Universal_DA.pdf)\n- Transferable Adversarial Training (2019): [[Paper]](http://proceedings.mlr.press/v97/liu19b/liu19b.pdf) [[Notes]](notes/86_TDT.pdf)\n- Multi-Adversarial Domain Adaptation (2018): [[Paper]](https://arxiv.org/abs/1809.02176) [[Notes]](notes/92_multi_adversarial_domain_adaptation.pdf)\n- Conditional Adversarial Domain Adaptation (2018): [[Paper]](https://arxiv.org/abs/1705.10667) [[Notes]](notes/85_CDAN.pdf)\n- Learning Adversarially Fair and Transferable Representations (2018): [[Paper]](https://arxiv.org/abs/1802.06309) [[Notes]](notes/88_learning_adv_fair_and_tsf_repres.pdf)\n- What is the Effect of Importance Weighting in Deep Learning? 
(2018): [[Paper]](https://arxiv.org/abs/1812.03372) [[Notes]](notes/89_effect_of_importance_weighting.pdf)\n\n## Explainability\n\n- Towards Interpreting and Mitigating Shortcut Learning Behavior of NLU Models (2021): [[Paper]](https://arxiv.org/abs/2103.06922) [[Notes]](notes/Towards_Interpreting_and_Mitigating_Shortcut_Learning_Behavior_of_NLU_Models.pdf)\n- Transformer Interpretability Beyond Attention Visualization (2020): [[Paper]](https://arxiv.org/abs/2012.09838) [[Notes]](notes/Transformer_Interpretability_Beyond_Attention_Visualization.pdf)\n- What shapes feature representations Exploring datasets architectures and training (2020): [[Paper]](https://arxiv.org/abs/2006.12433) [[Notes]](notes/What_shapes_feature_representations_Exploring_datasets_architectures_and_training.pdf)\n- Attention-based Dropout Layer for Weakly Supervised Object Localization (2019): [[Paper]](http://openaccess.thecvf.com/content_CVPR_2019/papers/Choe_Attention-Based_Dropout_Layer_for_Weakly_Supervised_Object_Localization_CVPR_2019_paper.pdf) [[Notes]](notes/58_attention_based_dropout.pdf)\n- Attention is not Explanation (2019): [[Paper]](https://arxiv.org/abs/1902.10186) [[Notes]](notes/Attention_is_not_Explanation.pdf)\n- SmoothGrad removing noise by adding noise (2017): [[Paper]](https://arxiv.org/abs/1706.03825) [[Notes]](notes/SmoothGrad_removing_noise_by_adding_noise.pdf)\n- Axiomatic Attribution for Deep Networks (2017): [[Paper]](https://arxiv.org/abs/1703.01365) [[Notes]](notes/Axiomatic_Attribution_for_Deep_Networks.pdf)\n- Attention Branch Network: Learning of Attention Mechanism for Visual Explanation (2019): [[Paper]](https://arxiv.org/abs/1812.10025) [[Notes]](notes/57_attention_branch_netwrok.pdf)\n- Paying More Attention to Attention: Improving the Performance of CNNs via Attention Transfer (2016): [[Paper]](https://arxiv.org/abs/1612.03928) [[Notes]](notes/71_attention_transfer.pdf)\n\n## Natural Language Processing (NLP)\n\n- Pre-train, Prompt, and Predict: A 
Systematic Survey of Prompting Methods in Natural Language Processing (2021): [[Paper]](https://arxiv.org/abs/2107.13586) [[Notes]](notes/Pre-train,_Prompt,_and_Predict:_A_Systematic_Survey_of_Prompting_Methods_in_Natural_Language_Processing.pdf)\n- Unsupervised Data Augmentation with Naive Augmentation and without Unlabeled Data (2020): [[Paper]](https://arxiv.org/abs/2010.11966) [[Notes]](notes/102_uda_with_naive_aug.pdf)\n- Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning (2021): [[Paper]](https://arxiv.org/abs/2011.01403) [[Notes]](notes/104_supervised_const_for_fine_tuning.pdf)\n- BERT and PALs: Projected Attention Layers for Efficient Adaptation in Multi-Task Learning (2020): [[Paper]](https://arxiv.org/abs/1902.02671) [[Notes]](notes/99_bert_and_pals.pdf)\n- FreeLB: Enhanced Adversarial Training for Natural Language Understanding (2020): [[Paper]](https://arxiv.org/abs/1909.11764) [[Notes]](notes/101_freeLB.pdf)\n- MixText: Linguistically-Informed Interpolation for Semi-Supervised Text Classification (2020): [[Paper]](https://arxiv.org/abs/2004.12239) [[Notes]](notes/100_mixtext.pdf)\n\n## Generative Modeling\n\n- Generative Pretraining from Pixels (2020): [[Paper]](https://cdn.openai.com/papers/Generative_Pretraining_from_Pixels_V2.pdf) [[Notes]](notes/96_generative_pretraining_from_pixels.pdf)\n- Consistency Regularization for Generative Adversarial Networks (2020): [[Paper]](https://arxiv.org/abs/1910.12027) [[Notes]](notes/84_CR_GANs.pdf)\n\n## Unsupervised Learning\n- Invariant Information Clustering for Unsupervised Image Classification and Segmentation (2019): [[Paper]](https://arxiv.org/abs/1807.06653) [[Notes]](notes/78_IIC.pdf)\n- Deep Clustering for Unsupervised Learning of Visual Features (2018): [[Paper]](https://arxiv.org/abs/1807.05520) [[Notes]](notes/70_deep_clustering_for_un_visual_features.pdf)\n\n## Semantic Segmentation\n- DeepLabv3+: Encoder-Decoder with Atrous Separable Convolution (2018): 
[[Paper]](https://arxiv.org/abs/1802.02611) [[Notes]](notes/26_deeplabv3+.pdf)\n- Large Kernel Matters: Improve Semantic Segmentation by Global Convolutional Network (2017): [[Paper]](https://arxiv.org/abs/1703.02719) [[Notes]](notes/28_large_kernel_maters.pdf)\n- Understanding Convolution for Semantic Segmentation (2018): [[Paper]](https://arxiv.org/abs/1702.08502) [[Notes]](notes/29_understanding_conv_for_sem_seg.pdf)\n- Rethinking Atrous Convolution for Semantic Image Segmentation (2017): [[Paper]](https://arxiv.org/abs/1706.05587) [[Notes]](notes/25_deeplab_v3.pdf)\n- RefineNet: Multi-path refinement networks for high-resolution semantic segmentation (2017): [[Paper]](https://arxiv.org/abs/1611.06612) [[Notes]](notes/31_refinenet.pdf)\n- Pyramid Scene Parsing Network (2017): [[Paper]](http://jiaya.me/papers/PSPNet_cvpr17.pdf) [[Notes]](notes/22_pspnet.pdf)\n- SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation (2016): [[Paper]](https://arxiv.org/pdf/1511.00561) [[Notes]](notes/21_segnet.pdf)\n- ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation (2016): [[Paper]](https://arxiv.org/abs/1606.02147) [[Notes]](notes/27_enet.pdf)\n- Attention to Scale: Scale-aware Semantic Image Segmentation (2016): [[Paper]](https://arxiv.org/abs/1511.03339) [[Notes]](notes/30_atttention_to_scale.pdf)\n- Deeplab: semantic image segmentation with DCNN, atrous convs and CRFs (2016): [[Paper]](https://arxiv.org/abs/1606.00915) [[Notes]](notes/23_deeplab_v2.pdf)\n- U-Net: Convolutional Networks for Biomedical Image Segmentation (2015): [[Paper]](https://arxiv.org/abs/1505.04597) [[Notes]](notes/20_Unet.pdf)\n- Fully Convolutional Networks for Semantic Segmentation (2015): [[Paper]](https://people.eecs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdf) [[Notes]](notes/19_FCN.pdf)\n- Hypercolumns for object segmentation and fine-grained localization (2015): [[Paper]](http://home.bharathh.info/pubs/pdfs/BharathCVPR2015.pdf) 
[[Notes]](notes/24_hypercolumns.pdf)\n\n\n## Weakly- and Semi-supervised Semantic segmentation\n- Box-driven Class-wise Region Masking and Filling Rate Guided Loss (2019): [[Paper]](http://arxiv.org/abs/1904.11693) [[Notes]](notes/54_boxe_driven_weakly_segmentation.pdf)\n- FickleNet: Weakly and Semi-supervised Semantic Segmentation using Stochastic Inference (2019): [[Paper]](https://arxiv.org/abs/1902.10421) [[Notes]](notes/49_ficklenet.pdf)\n- Weakly-Supervised Semantic Segmentation Network with Deep Seeded Region Growing (2018): [[Paper]](http://openaccess.thecvf.com/content_cvpr_2018/papers/Huang_Weakly-Supervised_Semantic_Segmentation_CVPR_2018_paper.pdf) [[Notes]](notes/53_deep_seeded_region_growing.pdf)\n- Learning Pixel-level Semantic Affinity with Image-level Supervision (2018): [[Paper]](https://arxiv.org/abs/1803.10464) [[Notes]](notes/81_affinity_for_ws_segmentation.pdf)\n- Object Region Mining with Adversarial Erasing (2018): [[Paper]](https://arxiv.org/abs/1703.08448) [[Notes]](notes/51_object_region_manning_for_sem_seg.pdf)\n- Revisiting Dilated Convolution: A Simple Approach for Weakly- and Semi- Supervised Segmentation (2018): [[Paper]](https://arxiv.org/abs/1805.04574) [[Notes]](notes/52_dilates_convolution_semi_super_segmentation.pdf)\n- Tell Me Where to Look: Guided Attention Inference Network (2018): [[Paper]](https://arxiv.org/abs/1802.10171) [[Notes]](notes/50_tell_me_where_to_look.pdf)\n- Semi Supervised Semantic Segmentation Using Generative Adversarial Network (2017): [[Paper]](https://arxiv.org/abs/1703.09695) [[Notes]](notes/82_ss_segmentation_gans.pdf)\n- Decoupled Deep Neural Network for Semi-supervised Semantic Segmentation (2015): [[Paper]](https://arxiv.org/abs/1506.04924) [[Notes]](notes/47_decoupled_nn_for_segmentation.pdf)\n- Weakly- and Semi-Supervised Learning of a DCNN for Semantic Image Segmentation (2015): [[Paper]](https://arxiv.org/abs/1502.02734) [[Notes]](notes/48_weakly_and_ss_for_segmentation.pdf)\n\n\n## Information 
Retrieval\n- VSE++: Improving Visual-Semantic Embeddings with Hard Negatives (2018): [[Paper]](https://arxiv.org/abs/1707.05612) [[Notes]](notes/77_vse++.pdf)\n\n\n## Graph Neural Network\n- Pixels to Graphs by Associative Embedding (2017): [[Paper]](https://arxiv.org/abs/1706.07365) [[Notes]](notes/36_pixels_to_graphs.pdf)\n- Associative Embedding: End-to-End Learning for Joint Detection and Grouping (2017): [[Paper]](https://arxiv.org/abs/1611.05424) [[Notes]](notes/35_associative_emb.pdf)\n- Interaction Networks for Learning about Objects, Relations and Physics (2016): [[Paper]](https://arxiv.org/abs/1612.00222) [[Notes]](notes/18_interaction_nets.pdf)\n- DeepWalk: Online Learning of Social Representations (2014): [[Paper]](http://www.perozzi.net/publications/14_kdd_deepwalk.pdf) [[Notes]](notes/deep_walk.pdf)\n- The graph neural network model (2009): [[Paper]](https://persagen.com/files/misc/scarselli2009graph.pdf) [[Notes]](notes/graph_neural_nets.pdf)\n\n## Regularization\n- Manifold Mixup: Better Representations by Interpolating Hidden States (2018): [[Paper]](https://arxiv.org/abs/1806.05236) [[Notes]](notes/43_manifold_mixup.pdf)\n\n## Deep learning Methods \u0026 Models\n- AutoAugment (2018): [[Paper]](https://arxiv.org/abs/1805.09501) [[Notes]](notes/41_autoaugment.pdf)\n- Stacked Hourglass (2017): [[Paper]](http://ismir2018.ircam.fr/doc/pdfs/138_Paper.pdf) [[Notes]](notes/34_stacked_hourglass.pdf)\n\n\n## Document analysis and segmentation\n- dhSegment: A generic deep-learning approach for document segmentation (2018): [[Paper]](https://arxiv.org/abs/1804.10371) [[Notes]](notes/dhSegement.pdf)\n- Learning to extract semantic structure from documents using multimodal fully convolutional neural networks (2017): [[Paper]](https://arxiv.org/abs/1706.02337) [[Notes]](notes/learning_to_extract.pdf)\n- Page Segmentation for Historical Handwritten Document Images Using Conditional Random Fields (2016): 
[[Paper]](https://www.researchgate.net/publication/312486501_Page_Segmentation_for_Historical_Handwritten_Document_Images_Using_Conditional_Random_Fields) [[Notes]](notes/seg_with_CRFs.pdf)\n- ICDAR 2015 competition on text line detection in historical documents (2015): [[Paper]](http://ieeexplore.ieee.org/abstract/document/7333945/) [[Notes]](notes/ICDAR2015.pdf)\n- Handwritten text line segmentation using Fully Convolutional Network (2017): [[Paper]](https://ieeexplore.ieee.org/document/8270267/) [[Notes]](notes/handwritten_text_seg_FCN.pdf)\n- Deep Neural Networks for Large Vocabulary Handwritten Text Recognition (2015): [[Paper]](https://tel.archives-ouvertes.fr/tel-01249405/document) [[Notes]](notes/andwriten_text_recognition.pdf)\n- Page Segmentation of Historical Document Images with Convolutional Autoencoders (2015): [[Paper]](https://ieeexplore.ieee.org/abstract/document/7333914/) [[Notes]](notes/segmentation_with_CAE.pdf)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyassouali%2Fml-paper-notes","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyassouali%2Fml-paper-notes","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyassouali%2Fml-paper-notes/lists"}