Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with vision-transformer
A curated list of projects in awesome lists tagged with vision-transformer.
https://github.com/open-mmlab/mmdetection
OpenMMLab Detection Toolbox and Benchmark
cascade-rcnn convnext detr fast-rcnn faster-rcnn glip grounding-dino instance-segmentation mask-rcnn object-detection panoptic-segmentation pytorch retinanet rtmdet semisupervised-learning ssd swin-transformer transformer vision-transformer yolo
Last synced: 16 Dec 2024
https://github.com/lukas-blecher/latex-ocr
pix2tex: Using a ViT to convert images of equations into LaTeX code.
dataset deep-learning im2latex im2markup im2text image-processing image2text latex latex-ocr machine-learning math-ocr ocr python pytorch transformer vision-transformer vit
Last synced: 16 Dec 2024
https://github.com/lukas-blecher/LaTeX-OCR
pix2tex: Using a ViT to convert images of equations into LaTeX code.
dataset deep-learning im2latex im2markup im2text image-processing image2text latex latex-ocr machine-learning math-ocr ocr python pytorch transformer vision-transformer vit
Last synced: 30 Oct 2024
https://github.com/nielsrogge/transformers-tutorials
This repository contains demos I made with the Transformers library by HuggingFace.
bert gpt-2 layoutlm pytorch transformers vision-transformer
Last synced: 16 Dec 2024
https://github.com/NielsRogge/Transformers-Tutorials
This repository contains demos I made with the Transformers library by HuggingFace.
bert gpt-2 layoutlm pytorch transformers vision-transformer
Last synced: 30 Oct 2024
https://github.com/adithya-s-k/omniparse
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
ingestion-api ocr omniparser parse-server parser-library vision-transformer web-crawler whisper-api
Last synced: 17 Dec 2024
https://github.com/jingyunliang/swinir
SwinIR: Image Restoration Using Swin Transformer (official repository)
compression-artifact-reduction deblocking decompression denoising image-deblocking image-denoising image-restoration image-sr image-super-resolution lightweight-image-super-resolution low-level-vision real-world-image-super-resolution restoration super-resolution transformer vision-transformer
Last synced: 19 Dec 2024
https://github.com/JingyunLiang/SwinIR
SwinIR: Image Restoration Using Swin Transformer (official repository)
compression-artifact-reduction deblocking decompression denoising image-deblocking image-denoising image-restoration image-sr image-super-resolution lightweight-image-super-resolution low-level-vision real-world-image-super-resolution restoration super-resolution transformer vision-transformer
Last synced: 13 Nov 2024
https://github.com/foundationvision/var
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
auto-regressive-model autoregressive-models diffusion-models generative-ai generative-model gpt gpt-2 image-generation large-language-models neurips transformers vision-transformer
Last synced: 17 Dec 2024
https://github.com/FoundationVision/VAR
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
auto-regressive-model autoregressive-models diffusion-models generative-ai generative-model gpt gpt-2 image-generation large-language-models neurips transformers vision-transformer
Last synced: 04 Nov 2024
https://github.com/huawei-noah/efficient-ai-backbones
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
convolutional-neural-networks efficient-inference ghostnet imagenet model-compression pretrained-models pytorch tensorflow transformer vision-transformer
Last synced: 17 Dec 2024
https://github.com/huawei-noah/Efficient-AI-Backbones
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
convolutional-neural-networks efficient-inference ghostnet imagenet model-compression pretrained-models pytorch tensorflow transformer vision-transformer
Last synced: 28 Oct 2024
https://github.com/open-mmlab/mmpretrain
OpenMMLab Pre-training Toolbox and Benchmark
beit clip constrastive-learning convnext deep-learning image-classification mae masked-image-modeling mobilenet moco multimodal pretrained-models pytorch resnet self-supervised-learning swin-transformer vision-transformer
Last synced: 21 Dec 2024
https://github.com/google-research/scenic
Scenic: A Jax Library for Computer Vision Research and Beyond
attention computer-vision deep-learning jax research transformers vision-transformer
Last synced: 17 Dec 2024
https://github.com/towhee-io/towhee
Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
computer-vision convolutional-networks embedding-vectors embeddings feature-extraction feature-vector image-processing image-retrieval llm machine-learning milvus pipeline towhee transformer unstructured-data video-processing vision-transformer vit
Last synced: 16 Dec 2024
https://github.com/InternLM/InternLM-XComposer
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
chatgpt foundation gpt gpt-4 instruction-tuning language-model large-language-model large-vision-language-model llm mllm multi-modality multimodal supervised-finetuning vision-language-model vision-transformer visual-language-learning
Last synced: 14 Nov 2024
https://github.com/internlm/internlm-xcomposer
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
chatgpt foundation gpt gpt-4 instruction-tuning language-model large-language-model large-vision-language-model llm mllm multi-modality multimodal supervised-finetuning vision-language-model vision-transformer visual-language-learning
Last synced: 19 Dec 2024
https://github.com/mit-han-lab/efficientvit
Efficient vision foundation models for high-resolution generation and perception.
deep-compression-autoencoder efficient-diffusion-model efficientvit high-resolution imagenet segment-anything segmentation vision-transformer
Last synced: 17 Dec 2024
https://github.com/baaivision/eva
EVA Series: Visual Representation Fantasies from BAAI
foundation-models representation-learning vision-transformer
Last synced: 19 Dec 2024
https://github.com/baaivision/EVA
EVA Series: Visual Representation Fantasies from BAAI
foundation-models representation-learning vision-transformer
Last synced: 28 Oct 2024
https://github.com/hila-chefer/transformer-explainability
[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer based networks.
attention-matrix attention-visualization bert bert-model cvpr2021 deep-learning explainability perturbation transformer-interpretability vision-transformer visualize-classifications vit
Last synced: 21 Dec 2024
https://github.com/alibaba/easycv
An all-in-one toolkit for computer vision
classification computer-vision object-detection pytorch self-supervised-learning transformers vision-transformer
Last synced: 17 Dec 2024
https://github.com/hila-chefer/Transformer-Explainability
[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer based networks.
attention-matrix attention-visualization bert bert-model cvpr2021 deep-learning explainability perturbation transformer-interpretability vision-transformer visualize-classifications vit
Last synced: 30 Oct 2024
https://github.com/alibaba/EasyCV
An all-in-one toolkit for computer vision
classification computer-vision object-detection pytorch self-supervised-learning transformers vision-transformer
Last synced: 26 Oct 2024
https://github.com/microsoft/cream
This is a collection of our NAS and Vision Transformer work.
automl efficiency knowledge-distillation nas rpe vision-transformer vit-compression
Last synced: 19 Dec 2024
https://github.com/microsoft/Cream
This is a collection of our NAS and Vision Transformer work.
automl efficiency knowledge-distillation nas rpe vision-transformer vit-compression
Last synced: 05 Nov 2024
https://github.com/vitae-transformer/vitpose
The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation" and [TPAMI'23] "ViTPose++: Vision Transformer for Generic Body Pose Estimation"
deep-learning distillation mae pose-estimation pytorch self-supervised-learning vision-transformer
Last synced: 19 Dec 2024
https://github.com/jingyunliang/vrt
VRT: A Video Restoration Transformer (official repository)
deblurring denoising low-level-vision restoration sr super-resolution transformer video video-deblurring video-denoising video-restoration video-sr video-super-resolution vision-transformer
Last synced: 15 Dec 2024
https://github.com/JingyunLiang/VRT
VRT: A Video Restoration Transformer (official repository)
deblurring denoising low-level-vision restoration sr super-resolution transformer video video-deblurring video-denoising video-restoration video-sr video-super-resolution vision-transformer
Last synced: 06 Nov 2024
https://github.com/OpenGVLab/InternVideo
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
action-recognition benchmark contrastive-learning foundation-models instruction-tuning masked-autoencoder multimodal open-set-recognition self-supervised spatio-temporal-action-localization temporal-action-localization video-clip video-data video-dataset video-question-answering video-retrieval video-understanding vision-transformer zero-shot-classification zero-shot-retrieval
Last synced: 28 Oct 2024
https://github.com/czczup/vit-adapter
[ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions
adapter object-detection semantic-segmentation vision-transformer
Last synced: 15 Dec 2024
https://github.com/MCG-NJU/VideoMAE
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
action-recognition mae masked-autoencoder neurips-2022 pytorch self-supervised-learning transformer video-analysis video-representation-learning video-transformer video-understanding vision-transformer
Last synced: 27 Oct 2024
https://github.com/czczup/ViT-Adapter
[ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions
adapter object-detection semantic-segmentation vision-transformer
Last synced: 04 Nov 2024
https://github.com/emcf/thepipe
Extract clean data from anywhere, powered by vision-language models ⚡
gpt-4 gpt-4o large-language-models multimodal pdf scrapers vision-transformer web
Last synced: 19 Dec 2024
https://github.com/yitu-opensource/T2T-ViT
ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
t2t-transformer vision-transformer vit
Last synced: 13 Nov 2024
https://github.com/nvlabs/voxformer
Official PyTorch implementation of VoxFormer [CVPR 2023 Highlight]
2d-to-3d 3d-perception 3d-scene-understanding artificial-intelligence autonomous-driving autonomous-vehicles computer-vision deep-learning machine-learning occupancy-grid-map semantic-scene-completion semantickitti vision-transformer voxel-proceessing
Last synced: 16 Dec 2024
https://github.com/NVlabs/VoxFormer
Official PyTorch implementation of VoxFormer [CVPR 2023 Highlight]
2d-to-3d 3d-perception 3d-scene-understanding artificial-intelligence autonomous-driving autonomous-vehicles computer-vision deep-learning machine-learning occupancy-grid-map semantic-scene-completion semantickitti vision-transformer voxel-proceessing
Last synced: 28 Oct 2024
https://github.com/OFA-Sys/ONE-PEACE
A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
audio-language contrastive-loss foundation-models multimodal representation-learning vision-and-language vision-language vision-transformer
Last synced: 29 Nov 2024
https://github.com/jacobgil/vit-explain
Explainability for Vision Transformers
deep-learning explainable-ai pytorch transformer vision-transformer
Last synced: 20 Dec 2024
https://github.com/hustvl/yolos
[NeurIPS 2021] You Only Look at One Sequence
computer-vision object-detection transformer vision-transformer
Last synced: 18 Dec 2024
https://github.com/hustvl/YOLOS
[NeurIPS 2021] You Only Look at One Sequence
computer-vision object-detection transformer vision-transformer
Last synced: 09 Nov 2024
https://github.com/xxxnell/how-do-vits-work
(ICLR 2022 Spotlight) Official PyTorch implementation of "How Do Vision Transformers Work?"
loss-landscape pytorch self-attention transformer vision-transformer
Last synced: 15 Nov 2024
https://github.com/NVlabs/FasterViT
[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with Hierarchical Attention
ade20k backbone coco deep-learning foundation-models image-classification image-net object-detection pre-trained-model self-attention semantic-segmentation vision-transformer visual-recognition
Last synced: 28 Oct 2024
https://github.com/sunzey/alphaclip
[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
deep-learning machine-learning vision-and-language vision-language vision-language-model vision-transformer
Last synced: 21 Dec 2024
https://github.com/Alibaba-MIIL/ImageNet21K
Official PyTorch implementation of the paper "ImageNet-21K Pretraining for the Masses" (NeurIPS 2021)
downstream imagenet21k mixer multi-label-classification pretraining semantic-softmax single-label vision-transformer
Last synced: 26 Oct 2024
https://github.com/4DVLab/Vision-Centric-BEV-Perception
Vision-Centric BEV Perception: A Survey
bev-perception bird-eye-view deep-learning transformer vision-transformer
Last synced: 28 Oct 2024
https://github.com/baudm/parseq
Scene Text Recognition with Permuted Autoregressive Sequence Models (ECCV 2022)
computer-vision eccv eccv2022 ocr optical-character-recognition scene-text-recognition text-recognition vision-transformer
Last synced: 20 Dec 2024
https://github.com/blaizzy/mlx-vlm
MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.
apple-silicon florence2 idefics llava llm local-ai mlx molmo paligemma pixtral vision-framework vision-language-model vision-transformer
Last synced: 19 Dec 2024
https://github.com/mv-lab/swin2sr
[ECCV] Swin2SR: SwinV2 Transformer for Compressed Image Super-Resolution and Restoration. Advances in Image Manipulation (AIM) workshop ECCV 2022. Try it out! over 3.3M runs https://replicate.com/mv-lab/swin2sr
compression compression-artifact-reduction computer-vision deblocking deep-learning denoising eccv2022 image-denoising image-processing image-restoration image-sr image-super-resolution jpeg low-level-vision ntire super-resolution swin2sr swinir transformer vision-transformer
Last synced: 06 Nov 2024
https://github.com/vitae-transformer/vitdet
Unofficial implementation for [ECCV'22] "Exploring Plain Vision Transformer Backbones for Object Detection"
deep-learning object-detection pytorch vision-transformer
Last synced: 15 Dec 2024
https://github.com/ViTAE-Transformer/ViTDet
Unofficial implementation for [ECCV'22] "Exploring Plain Vision Transformer Backbones for Object Detection"
deep-learning object-detection pytorch vision-transformer
Last synced: 15 Nov 2024
https://github.com/jdai-cv/cotnet
This is an official implementation for "Contextual Transformer Networks for Visual Recognition".
contextual-transformer cotnet image-classification imagenet instance-segmentation mask-rcnn mscoco object-detection semantic-segmentation vision-transformer
Last synced: 15 Dec 2024
https://github.com/mahmoodlab/hipt
Hierarchical Image Pyramid Transformer - CVPR 2022 (Oral)
computational-pathology cvpr cvpr2022 deep-learning hierarchical-attention-networks high-resolution histopathology pretrained-weights pytorch self-supervised-learning transfer-learning unsupervised-learning vision-transformer weakly-supervised-learning
Last synced: 15 Dec 2024
https://github.com/mahmoodlab/HIPT
Hierarchical Image Pyramid Transformer - CVPR 2022 (Oral)
computational-pathology cvpr cvpr2022 deep-learning hierarchical-attention-networks high-resolution histopathology pretrained-weights pytorch self-supervised-learning transfer-learning unsupervised-learning vision-transformer weakly-supervised-learning
Last synced: 13 Nov 2024
https://github.com/Blaizzy/mlx-vlm
MLX-VLM is a package for running Vision LLMs locally on your Mac using MLX.
apple-silicon florence2 idefics llava llm local-ai mlx molmo paligemma pixtral vision-framework vision-language-model vision-transformer
Last synced: 25 Nov 2024
https://github.com/ViTAE-Transformer/ViTAE-Transformer-Remote-Sensing
A comprehensive list [SAMRS@NeurIPS'23, RVSA@TGRS'22, RSP@TGRS'22] of our research works related to remote sensing, including papers, codes, and citations. Note: The repo for [TGRS'22] "An Empirical Study of Remote Sensing Pretraining" has been moved to: https://github.com/ViTAE-Transformer/RSP
change-detection classification deep-learning object-detection remote-sensing self-supervised-learning semantic-segmentation transfer-learning vision-transformer
Last synced: 15 Nov 2024
https://github.com/vitae-transformer/vitae-transformer-remote-sensing
A comprehensive list [SAMRS@NeurIPS'23, RVSA@TGRS'22, RSP@TGRS'22] of our research works related to remote sensing, including papers, codes, and citations. Note: The repo for [TGRS'22] "An Empirical Study of Remote Sensing Pretraining" has been moved to: https://github.com/ViTAE-Transformer/RSP
change-detection classification deep-learning object-detection remote-sensing self-supervised-learning semantic-segmentation transfer-learning vision-transformer
Last synced: 14 Nov 2024
https://github.com/google-research/maxvit
[ECCV 2022] Official repository for "MaxViT: Multi-Axis Vision Transformer". SOTA foundation models for classification, detection, segmentation, image quality, and generative modeling...
architecture classification cnn computer-vision image image-processing mlp object-detection resnet segmentation transformer transformer-architecture vision-transformer
Last synced: 17 Nov 2024
https://github.com/NVlabs/GCVit
[ICML 2023] Official PyTorch implementation of Global Context Vision Transformers
ade20k backbone coco deep-learning imagenet imagenet-classification object-detection pre-train pre-trained-model self-attention semantic-segmentation vision-transformer visual-recognition
Last synced: 15 Nov 2024
https://github.com/vitae-transformer/remote-sensing-rvsa
The official repo for [TGRS'22] "Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model"
deep-learning foundation-model foundation-models object-detection pytorch remote-sensing remote-sensing-foundation-model scene-classification self-supervised-learning semantic-segmentation transfer-learning vision-transformer
Last synced: 15 Dec 2024
https://github.com/raoyongming/GFNet
[NeurIPS 2021] [T-PAMI] Global Filter Networks for Image Classification
computer-vision deep-learning image-classification image-recognition vision-transformer
Last synced: 15 Nov 2024
https://github.com/rentainhe/visualization
A collection of visualization functions
attention attention-map attention-mechanism data-visualization deep-learning transformer vision vision-mlp vision-transformer visualization
Last synced: 21 Dec 2024
https://github.com/oneflow-inc/libai
LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training
data-parallelism deep-learning distributed-training large-scale model-parallelism nlp oneflow pipeline-parallelism self-supervised-learning transformer vision-transformer
Last synced: 15 Dec 2024
https://github.com/Oneflow-Inc/libai
LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training
data-parallelism deep-learning distributed-training large-scale model-parallelism nlp oneflow pipeline-parallelism self-supervised-learning transformer vision-transformer
Last synced: 16 Nov 2024
https://github.com/omerbt/splice
Official Pytorch Implementation for "Splicing ViT Features for Semantic Appearance Transfer" presenting "Splice" (CVPR 2022 Oral)
cvpr2022 generative-models image-translation single-image single-image-generation splice style-transfer vision-transformer
Last synced: 16 Dec 2024
https://github.com/omerbt/Splice
Official Pytorch Implementation for "Splicing ViT Features for Semantic Appearance Transfer" presenting "Splice" (CVPR 2022 Oral)
cvpr2022 generative-models image-translation single-image single-image-generation splice style-transfer vision-transformer
Last synced: 15 Nov 2024
https://github.com/asyml/vision-transformer-pytorch
Pytorch version of Vision Transformer (ViT) with pretrained models. This is part of CASL (https://casl-project.github.io/) and ASYML project.
Last synced: 17 Dec 2024
https://github.com/hustvl/mimdet
[ICCV 2023] You Only Look at One Partial Sequence
computer-vision instance-segmentation mae masked-image-modeling object-detection transformer vision-transformer
Last synced: 16 Dec 2024
https://github.com/IBM/CrossViT
Official implementation of CrossViT. https://arxiv.org/abs/2103.14899
computer-vision deep-learning multi-scale-features vision-transformer
Last synced: 14 Nov 2024
https://github.com/shoufachen/adaptformer
[NeurIPS 2022] Implementation of "AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition"
adapter neurips-2022 recognition vision-transformer visual-adapter
Last synced: 20 Dec 2024
https://github.com/xmed-lab/CLIP_Surgery
CLIP Surgery for Better Explainability with Enhancement in Open-Vocabulary Tasks
clip explainability interpretability multilabel multimodal open-vocabulary sam segment-anything segmentation vision-transformer
Last synced: 27 Oct 2024
https://github.com/megvii-research/FQ-ViT
[IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer
imagenet post-training-quantization pytorch quantization vision-transformer
Last synced: 28 Oct 2024
https://github.com/ZhangGongjie/SAM-DETR
[CVPR'2022] SAM-DETR & SAM-DETR++: Official PyTorch Implementation
computer-vision cvpr cvpr2022 deep-learning detection detr machine-learning object-detection pytorch transformer vision vision-transformer
Last synced: 28 Oct 2024
https://github.com/roatienza/deep-text-recognition-benchmark
PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)
ocr str vision-transformer vitstr
Last synced: 15 Dec 2024
https://github.com/martinsbruveris/tensorflow-image-models
TensorFlow port of PyTorch Image Models (timm) - image models with pretrained weights.
imagenet tensorflow vision-transformer
Last synced: 15 Nov 2024
https://github.com/paddlepaddle/passl
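PASSL includes self-supervised image learning algorithms such as SimCLR, MoCo v1/v2, BYOL, CLIP, PixPro, SimSiam, SwAV, BEiT, and MAE, as well as basic vision algorithms such as Vision Transformer, DeiT, Swin Transformer, CvT, T2T-ViT, MLP-Mixer, XCiT, ConvNeXt, and PVTv2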
beit clip convnext cvt deep-learning deit mae moco moco-v2 paddle pixpro pvt self-supervised-learning simclr swav swin-transformer vision-transformer vit xcit
Last synced: 21 Dec 2024
https://github.com/dwctod/eccv2022-papers-with-code-demo
A collection of the latest ECCV results, including papers, code, and demo videos. Recommendations are welcome!
ai computer-vision cv dataset diffusion eccv eccv2022 face-recognition image-segmentation multimodal-deep-learning nerf objection-detection vision-transformer
Last synced: 21 Nov 2024
https://github.com/DerrickXuNu/v2x-vit
[ECCV2022] Official Implementation of paper "V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer"
3d-object-detection autonomous-driving collaborative-perception computer-vision deep-learning machine-learning multi-agent-system pytorch simulation v2x vehicle-to-everything vision-transformer
Last synced: 28 Oct 2024
https://github.com/jingyunliang/rvrt
Recurrent Video Restoration Transformer with Guided Deformable Attention (NeurIPS 2022, official repository)
deblurring denoising low-level-vision restoraton sr super-resolution transformer video video-deblurring video-denoising video-restoration video-sr video-super-resolution vision-transformer
Last synced: 18 Dec 2024
https://github.com/Haiyang-W/GiT
[ECCV2024 Oral🔥] Official Implementation of "GiT: Towards Generalist Vision Transformer through Universal Language Interface"
foundation-models perception transformer unified vision-and-language vision-transformer
Last synced: 28 Oct 2024
https://github.com/vitae-transformer/vitae-transformer
The official repo for [NeurIPS'21] "ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias" and [IJCV'22] "ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond"
ade20k deep-learning imagenet imagenet-classification mscoco object-detection semantic-segmentation vision-transformer vitae-transformer
Last synced: 18 Dec 2024
https://github.com/staghado/vit.cpp
Inference Vision Transformer (ViT) in plain C/C++ with ggml
ai c computer-vision cpp cpu edge-computing ggml image-classification llamacpp vision-transformer whisper-cpp
Last synced: 17 Dec 2024
https://github.com/paddlepaddle/interpretdl
InterpretDL: Interpretation of Deep Learning Models, a library of model interpretability algorithms based on PaddlePaddle.
convolutional-neural-networks explanations grad-cam interpretation-algorithms lime model-interpretation nlp-models paddlepaddle smoothgrad vision-transformer visualizations
Last synced: 15 Dec 2024
https://github.com/PaddlePaddle/InterpretDL
InterpretDL: Interpretation of Deep Learning Models, a library of model interpretability algorithms based on PaddlePaddle.
convolutional-neural-networks explanations grad-cam interpretation-algorithms lime model-interpretation nlp-models paddlepaddle smoothgrad vision-transformer visualizations
Last synced: 17 Nov 2024
https://github.com/ziqipang/lm4visualencoding
[ICLR 2024 (Spotlight)] "Frozen Transformers in Language Models are Effective Visual Encoder Layers"
Last synced: 18 Dec 2024
https://github.com/NVIDIA/transformer-ls
Official PyTorch Implementation of Long-Short Transformer (NeurIPS 2021).
efficient-transformers long-sequence transformer vision-transformer
Last synced: 16 Nov 2024
https://github.com/nvidia/transformer-ls
Official PyTorch Implementation of Long-Short Transformer (NeurIPS 2021).
efficient-transformers long-sequence transformer vision-transformer
Last synced: 29 Oct 2024
https://github.com/vitae-transformer/vitae-transformer-matting
A comprehensive list of our research works related to image matting, including papers, codes, datasets, demos, and citations. Note: The repo for [IJCV'23] "Rethinking Portrait Matting with Privacy Preserving" has been moved to: https://github.com/ViTAE-Transformer/P3M-Net
computer-vision deep-learning image-matting privacy-preserving survey vision-transformer
Last synced: 14 Nov 2024
https://github.com/AnshMittal1811/MachineLearning-AI
This repository collects work I have done and material I have studied from Medium blogs, research papers, and other repositories (whether or not related to those papers).
3d-computer-vision audio-signal-processing computer-vision convolutional-neural-networks deep-learning deep-neural-networks generative-models gradcam graph-neural-networks image-classification lidar-point-cloud machine-learning neural-network neural-networks neural-radiance-fields neural-rendering pytorch transformers vision-transformer
Last synced: 28 Oct 2024
https://github.com/zhongkaifu/seq2seqsharp
Seq2SeqSharp is a fast, flexible tensor-based deep neural network framework written in .NET (C#). Its features include automatic differentiation, multiple network types (Transformer, LSTM, BiLSTM, and so on), multi-GPU support, cross-platform support (Windows, Linux, x86, x64, ARM), and multimodal models for text and images.
attention-model cuda deep-learning encoder-decoder gpu image lstm machine-translation neural-network seq2seq sequence-to-sequence tensor text transformer transformer-architecture transformer-encoder translation vision-transformer
Last synced: 21 Dec 2024
https://github.com/biasvariancelabs/aitlas
AiTLAS implements state-of-the-art AI methods for exploratory and predictive analysis of satellite images.
artificial-intelligence classification computer-vision dataset deep-learning deep-neural-networks earth-observation geospatial image-classification machine-learning models object-detection pytorch remote-sensing satellite-data satellite-images segmentation sentinel vision-transformer
Last synced: 15 Nov 2024
https://github.com/vitae-transformer/qformer
The official repo for [TPAMI'23] "Vision Transformer with Quadrangle Attention"
attention-mechanism backbone classification deep-learning object-detection pose-estimation semantic-segmentation vision-transformer
Last synced: 19 Dec 2024
https://github.com/chou141253/FGVC-PIM
PyTorch implementation of "A Novel Plug-in Module for Fine-Grained Visual Classification", for the fine-grained visual classification task.
efficientnet fgvc fine-grained-visual-categorization resnet swin-transformer vision-transformer
Last synced: 05 Nov 2024
https://github.com/salvatorera/tutorial
Tutorials on machine learning, artificial intelligence, data science with math explanation and reusable code (in python and R)
artificial-intelligence bioinformatics biology computer-vision convolutional-neural-networks data-science deep-learning graph image machine-learning natural-language-processing nlp python r streamlit streamlit-webapp tutorial tutorials vision-transformer
Last synced: 21 Dec 2024
https://github.com/SforAiDl/vformer
A modular PyTorch library for vision transformer models
Last synced: 15 Nov 2024
https://github.com/bfshi/AbSViT
Official code for "Top-Down Visual Attention from Analysis by Synthesis" (CVPR 2023 highlight)
attention classification cvpr pytorch segmentation vision-transformer
Last synced: 09 Nov 2024
https://github.com/trzy/llava-cpp-server
LLaVA server (llama.cpp).
llama llama2 llava llm multimodal vision-transformer
Last synced: 17 Dec 2024
https://github.com/kyegomez/vit-rgts
Open source implementation of "Vision Transformers Need Registers"
attention-mechanism gpt4 vision-api vision-transformer vit
Last synced: 20 Dec 2024
https://github.com/richarizardd/self-supervised-vit-path
Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology (LMRL Workshop, NeurIPS 2021)
computational-pathology deep-learning histopathology neurips pretrained-weights pytorch self-supervised-learning transfer-learning unsupervised-learning vision-transformer weakly-supervised-learning
Last synced: 24 Nov 2024
https://github.com/Richarizardd/Self-Supervised-ViT-Path
Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology (LMRL Workshop, NeurIPS 2021)
computational-pathology deep-learning histopathology neurips pretrained-weights pytorch self-supervised-learning transfer-learning unsupervised-learning vision-transformer weakly-supervised-learning
Last synced: 13 Nov 2024