Projects in Awesome Lists tagged with foundation-model

https://github.com/guardrails-ai/guardrails

Adding guardrails to large language models.

Last synced: 16 Mar 2026

https://github.com/OpenGVLab/InternGPT

InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. It now supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM)

chatgpt click draggan foundation-model gpt gpt-4 gradio husky image-captioning imagebind internimage langchain llama llm multimodal sam segment-anything vicuna video-generation vqa

Last synced: 27 Mar 2025

https://github.com/opengvlab/interngpt

InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. It now supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM)

chatgpt click draggan foundation-model gpt gpt-4 gradio husky image-captioning imagebind internimage langchain llama llm multimodal sam segment-anything vicuna video-generation vqa

Last synced: 14 May 2025

https://github.com/opengvlab/internimage

[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions

backbone deformable-convolution foundation-model object-detection semantic-segmentation

Last synced: 10 Apr 2025

https://github.com/OpenGVLab/InternImage

[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions

backbone deformable-convolution foundation-model object-detection semantic-segmentation

Last synced: 20 Mar 2025

https://github.com/bowang-lab/scgpt

foundation-model gpt single-cell

Last synced: 14 May 2025

https://github.com/foundationvision/glee

[CVPR 2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale

foundation-model interactive-segmentation object-detection open-vocabulary-detection open-vocabulary-segmentation open-vocabulary-video-segmentation open-world referring-expression-comprehension referring-expression-segmentation referring-video-object-segmentation segment-anything tracking video-instance-segmentation video-object-segmentation zero-shot-object-detection

Last synced: 15 May 2025

https://github.com/FoundationVision/GLEE

[CVPR 2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale

foundation-model interactive-segmentation object-detection open-vocabulary-detection open-vocabulary-segmentation open-vocabulary-video-segmentation open-world referring-expression-comprehension referring-expression-segmentation referring-video-object-segmentation segment-anything tracking video-instance-segmentation video-object-segmentation zero-shot-object-detection

Last synced: 19 Jul 2025

https://github.com/idea-research/grounding-dino-1.5-api

Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series

foundation-model grounding-dino object-detection open-set open-vocabulary-detection open-world zero-shot-object-detection

Last synced: 14 Apr 2025

https://github.com/IDEA-Research/Grounding-DINO-1.5-API

Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series

foundation-model grounding-dino object-detection open-set open-vocabulary-detection open-world zero-shot-object-detection

Last synced: 27 Sep 2025

https://github.com/bowang-lab/scGPT

foundation-model gpt single-cell

Last synced: 01 May 2025

https://github.com/opendrivelab/driveagi

[CVPR 2024 Highlight] GenAD: Generalized Predictive Model for Autonomous Driving & Foundation Models in Autonomous System

autonomous-driving embodied-ai foundation-model general-artificial-intelligence large-dataset policy-learning video-dataset video-generation world-models

Last synced: 15 May 2025

https://github.com/ailab-cvc/seed

Official implementation of SEED-LLaMA (ICLR 2024).

foundation-model multimodal vision-language

Last synced: 09 Apr 2025

https://github.com/opengvlab/videomaev2

[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking

action-detection action-recognition cvpr2023 foundation-model self-supervised-learning temporal-action-detection video-understanding

Last synced: 30 Jun 2025

https://github.com/OpenDriveLab/DriveAGI

[CVPR 2024 Highlight] GenAD: Generalized Predictive Model for Autonomous Driving & Foundation Models in Autonomous System

autonomous-driving embodied-ai foundation-model general-artificial-intelligence policy-learning

Last synced: 20 Mar 2025

https://github.com/OpenGVLab/VideoMAEv2

[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking

action-detection action-recognition cvpr2023 foundation-model self-supervised-learning temporal-action-detection video-understanding

Last synced: 16 Mar 2025

https://github.com/Clay-foundation/model

The Clay Foundation Model - An open source AI model and interface for Earth

digital-elevation-model earth-observation embeddings foundation-model sentinel-1 sentinel-2

Last synced: 24 Sep 2025

https://github.com/mahmoodlab/TRIDENT

Toolkit for large-scale whole-slide image processing.

deep-learning foundation-model foundation-models-for-pathology histology pathology whole-slide-image

Last synced: 24 Feb 2026

https://clay-foundation.github.io/model/

The Clay Foundation Model - An open source AI model and interface for Earth

digital-elevation-model earth-observation embeddings foundation-model sentinel-1 sentinel-2

Last synced: 01 Aug 2025

https://github.com/mahmoodlab/UNI

Pathology Foundation Model - Nature Medicine

computational-pathology digital-pathology foundation foundation-model histopathology mahmoodlab mass-100k nature-medicine pathology pathology-dinov2 pathology-fm pathology-foundation pathology-foundation-model pathology-self-supervised quantitative-pathology uni uni-foundation-model

Last synced: 06 May 2025

https://github.com/mahmoodlab/uni

A general-purpose foundation model for computational pathology - Nature Medicine

computational-pathology digital-pathology foundation foundation-model histopathology mahmoodlab mass-100k nature-medicine pathology pathology-dinov2 pathology-fm pathology-foundation pathology-foundation-model pathology-self-supervised quantitative-pathology uni uni-foundation-model

Last synced: 15 May 2025

https://github.com/vitae-transformer/remote-sensing-rvsa

The official repo for [TGRS'22] "Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model"

deep-learning foundation-model foundation-models object-detection pytorch remote-sensing remote-sensing-foundation-model scene-classification self-supervised-learning semantic-segmentation transfer-learning vision-transformer

Last synced: 05 Apr 2025

https://github.com/mahmoodlab/conch

Vision-Language Pathology Foundation Model - Nature Medicine

bioimage-analysis bioimage-informatics computational-pathology conch digital-pathology foundation-model health-informatics histopathology mahmoodlab medical-imaging nature-medicine nlp-machine-learning pathology quantitative-pathology vision-language-pathology

Last synced: 16 May 2025

https://cambridgeltl.github.io/visual-med-alpaca/

Visual Med-Alpaca is an open-source, multi-modal foundation model designed specifically for the biomedical domain, built on LLaMA-7B.

biomedical biomedical-image-processing foundation-model large-language-models multimodal

Last synced: 12 May 2025

https://github.com/opendrivelab/openscene

3D Occupancy Prediction Benchmark in Autonomous Driving

3d-occupancy autonomous-driving foundation-model

Last synced: 05 Apr 2025

https://github.com/mahmoodlab/CONCH

Vision-Language Pathology Foundation Model - Nature Medicine

bioimage-analysis bioimage-informatics computational-pathology conch digital-pathology foundation-model health-informatics histopathology mahmoodlab medical-imaging nature-medicine nlp-machine-learning pathology quantitative-pathology vision-language-pathology

Last synced: 06 May 2025

https://github.com/OpenDriveLab/OpenScene

3D Occupancy Prediction Benchmark in Autonomous Driving

3d-occupancy autonomous-driving foundation-model

Last synced: 20 Mar 2025

https://github.com/westlake-repl/recommendation-systems-without-explicit-id-features-a-literature-review

Paper List of Pre-trained Foundation Recommender Models

chatgpt chatgpt3 chatgpt4rec cross-domain-recommendation cross-domainrecommendation foundation-model gpt4rec language-model large-language-model llm llm-recommendation llm4rec multimodal multimodal-deep-learning multimodalrecommendation pre-training recommendation-system recommender-system transfer-learning transferable

Last synced: 21 Apr 2025

https://github.com/spotify-research/llark

Code for the paper "LLark: A Multimodal Instruction-Following Language Model for Music" by Josh Gardner, Simon Durand, Daniel Stoller, and Rachel Bittner.

foundation-model multimodal music-information-retrieval

Last synced: 17 Mar 2025

https://github.com/chao1224/MoleculeSTM

Multi-modal Molecule Structure-text Model for Text-based Editing and Retrieval, Nat Mach Intell 2023 (https://www.nature.com/articles/s42256-023-00759-6)

clip computation-chemistry drug-discovery editing foundation-model molecule-editing moleculeclip moleculestm pretraining retrieval

Last synced: 09 May 2025

https://github.com/chao1224/moleculestm

Multi-modal Molecule Structure-text Model for Text-based Editing and Retrieval, Nat Mach Intell 2023 (https://www.nature.com/articles/s42256-023-00759-6)

clip computation-chemistry drug-discovery editing foundation-model molecule-editing moleculeclip moleculestm pretraining retrieval

Last synced: 13 Apr 2025

https://github.com/zhanghm1995/Forge_VFM4AD

A comprehensive survey of forging vision foundation models for autonomous driving, including challenges, methodologies, and opportunities.

3dgs adaptation autonomous-driving diffusion end-to-end-autonomous-driving foundation-model large-language-models nerf pre-training survey world-models

Last synced: 24 Jul 2025

https://github.com/med-air/Endo-FM

[MICCAI'23] Foundation Model for Endoscopy Video Analysis via Large-scale Self-supervised Pre-train

endoscopy foundation-model large-scale miccai2023 pre-train self-supervised video

Last synced: 16 Mar 2025

https://github.com/mahmoodlab/mil-lab

Feather - Lightweight supervised slide foundation models (ICML 2025)

deep-learning foundation-model histology pathology whole-slide-image

Last synced: 16 Feb 2026

https://github.com/westlake-repl/PixelRec

A Large-scale Multimodal Dataset for Recommender Systems

foundation-model image-recommendation large-image-dataset large-language-model llm4rec multimodal-recommendation multimodal-recommendation-dataset pre-train-recommendation recommender-system text-recommendation vision-recommendation visual-recommender-system

Last synced: 08 May 2025

https://github.com/naver/dune

Code repository for "DUNE: Distilling a Universal Encoder from Heterogeneous 2D and 3D Teachers"

computer-vision foundation-model image-encoder knowledge-distillation vision-transformer

Last synced: 04 Apr 2026

https://github.com/sap-samples/btp-cap-genai-rag

Explore this repository for GenAI samples on SAP Business Technology Platform (SAP BTP). We provide examples for single and multitenant versions, showcasing integration of LLMs via SAP AI Core, LangChain in SAP CAP, and advanced techniques like Retrieval Augmented Generation (RAG).

4371 ai-core btp-use-case-factory cloud-foundry foundation-model genai generative-ai gpt hana kyma llm openai rag saas sample sample-code sap-btp sap-cap typescript vector-engine

Last synced: 31 Mar 2025

https://github.com/fajieyuan/SIGIR2021_Conure

Pre-training and Lifelong learning for User Embedding and Recommender System

catastrophic-forgetting continual-learning foundation-model general-purpose lifelong-learning lifelong-machine-learning one-for-all pre-training self-supervised-learning transfer-learning universal-recommender universal-representation user-profile user-representation

Last synced: 29 Apr 2025

https://github.com/chao1224/ProteinDT

ai4science drug-design foundation-model large-language-model llm protein protein-design protein-editing protein-sequence protein-structure

Last synced: 29 Apr 2025

https://github.com/chao1224/proteindt

ai4science drug-design foundation-model large-language-model llm protein protein-design protein-editing protein-sequence protein-structure

Last synced: 16 Mar 2025

https://github.com/nyumedml/headct_foundation

Foundation 3D ViT model for volumetric head CT

computer-vision computerized-tomography foundation-model self-supervised-learning

Last synced: 17 Mar 2026

https://github.com/alan-turing-institute/robots-in-disguise

Information and materials for the Turing's "robots-in-disguise" reading group on fundamental AI research.

deep-learning diffusion-models foundation-model hut23 language-models large-language-models machine-learning nlp transformers

Last synced: 21 Aug 2025

https://github.com/BoevaLab/CancerFoundation

CancerFoundation: A single-cell RNA sequencing foundation model to decipher drug resistance in cancer

cancer foundation-model single-cell

Last synced: 09 May 2026

https://github.com/yjyddq/eoser-ass-rl

Official Repository of "Taming Masked Diffusion Language Models via Consistency Trajectory Reinforcement Learning with Fewer Decoding Step"

foundation-model masked-diffusion-large-language-model reinforcement-learning

Last synced: 09 Oct 2025

https://github.com/automl/tempopfn

Official code release for the paper "TempoPFN: Synthetic Pre-training of Linear RNNs for Zero-shot Time Series Forecasting"

foundation-model synthetic-data-generation time-series-forecasting

Last synced: 28 Jan 2026

https://github.com/11yxk/SAM-LST

PyTorch implementation of the paper "Ladder Fine-tuning approach for SAM integrating complementary network".

fine-tuning foundation-model multi-organ-segmentation segment-anything

Last synced: 24 Jul 2025

https://github.com/thomasgust/molecumixer

Very incomplete right now: a pretrained ARGVAET system for generating, classifying, and predicting the properties of molecules. I couldn't upload the dataset or checkpoints due to size constraints.

argvaet bioinformatics foundation-model generative-ai generative-pretraining gnn molecule neural-network pretraining pytorch rdkit

Last synced: 23 Oct 2025

https://github.com/itrummer/naturalminer

Mine data for patterns described in natural language

data-mining data-science foundation-model language-model nlp

Last synced: 07 Mar 2026

https://github.com/garystafford/genai_fiction_summary

Mastering Long Document Insights: Advanced Summarization with Amazon Bedrock and Anthropic Claude Foundation Model

anthropic-claude foundation-model generative-ai text-summarization

Last synced: 27 Mar 2025

https://github.com/chansigit/scgpt-modern

A drop-in modernization of bowang-lab/scGPT for Python 3.12 + torch 2.6 + flash-attn 3 (H100 sm_90a native). Original pretrained weights load unmodified — compatible, more modern, faster.

bioinformatics deep-learning flash-attention foundation-model genomics h100 hopper llm pytorch rna-seq scgpt single-cell single-cell-genomics transcriptomics

Last synced: 29 Apr 2026

https://github.com/pointcloudyc/Industrial3D

Industrial3D: A Terrestrial LiDAR Point Cloud Dataset and Cross-Paradigm Benchmark for Industrial Infrastructure

benchmark digital-construction foundation-model point-cloud scan-to-bim unsupervised-segmentation weakly-supervised-segmentation

Last synced: 20 Apr 2026

https://github.com/2536ivan/nature

Simple Tailwind CSS and JavaScript project, Nature.

bio-inspired-optimization compiler deep-learning foundation-model histopathology language mahmoodlab metaheuristics nature-inspired-algorithms optimization postgis postgresql qiskit swarm-intelligence

Last synced: 09 May 2025