Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-dl-development
A collection of deep learning development resources (notes, courses, papers, and tools).
https://github.com/jason-cs18/awesome-dl-development
Uncategorized
- (Nature, 2023) Parameter-efficient fine-tuning of large-scale pre-trained language models
- (SenSys'22 Workshop) Towards Data-Efficient Continuous Learning for Edge Video Analytics via Smart Caching
- (VLDB'20) ODIN: Automated drift detection and recovery in video analytics
- CMU 10-414/714: Deep Learning Systems (Fall 2022)
- Machine Learning Compilation (Fall 2022)
- Towards AGI: Scaling, Alignment & Emergent Behaviors in Neural Nets (Winter 2023)
- UCB CS294 AISys: Machine Learning Systems (Spring 2022)
- Dive into Deep Learning (vol. 2)
- Understanding Deep Learning (UCL 2023)
- Computer Architecture: A Quantitative Approach (6th edition)
- Computer Systems: A Programmer's Perspective (2nd edition)
- Computer Networking: A Top-Down Approach (7th edition)
- PyTorch
- HuggingFace
- (SenSys'20) Distream: scaling live video analytics with workload-adaptive distributed edge intelligence
- (SEC'20 Best Paper Award) Spatula: Efficient cross-camera video analytics on large camera networks
- (SEC'19) Collaborative Learning between Cloud and End Devices: An Empirical Study on Location Prediction
- Efficient AI
- Efficient Transformers: A Survey (2018)
- Efficiency 360: Efficient Vision Transformers (2023)
- Harvard CS197: AI Research Experience (Fall 2022)
- PyTorch Lightning (a scalable DL framework for academia and industry) [Notes (in progress)](https://github.com/Jason-cs18/Awesome-DL-Development/blob/main/Tools/Pytorch-Lighning/README.md)
- Alibaba MNN (open-source inference engine for mobile devices)
- NVIDIA TAO
- NVIDIA TensorRT
- OpenAI Triton (open-source Python-like programming language for writing highly efficient GPU code without CUDA programming experience) [Notes (in progress)](https://github.com/Jason-cs18/Awesome-DL-Development/blob/main/Tools/OpenAI_Triton/readme.md)
- Submission notices
- DL & DLSys basics
- Edge-AI-Paper-List
- Machine Learning at Berkeley Reading List
- A reading list for machine learning systems
- Deep Learning for Generic Object Detection: A Survey (2018)
- Transformer Models: An Introduction and Catalog (2023)
- Full Stack Optimization of Transformer Inference: a Survey (2023)
- Reliable AI
- (NSDI'23) RECL: Responsive Resource-Efficient Continuous Learning for Video Analytics
- (TON'22) Scheduling Massive Camera Streams to Optimize Large-Scale Live Video Analytics
- (InfoCom'22) ComAI: Enabling Lightweight, Collaborative Intelligence by Retrofitting Vision DNNs
- (ICDCS'22) Multi-View Scheduling of Onboard Live Video Analytics to Minimize Frame Processing Latency
- (ICRA'19) Memory efficient experience replay for streaming learning
- (CVPR'22) Proper Reuse of Image Classification Features Improves Object Detection
- (NSDI'22) Ekya: Continuous Learning of Video Analytics Models on Edge Compute Servers
- (IEEE IOT 2022) Cost-Efficient Continuous Edge Learning for Artificial Intelligence of Things
- (ICCV'21) Real-Time Video Inference on Edge Devices via Adaptive Model Streaming
- (SenSys'22) Turbo: Opportunistic Enhancement for Edge Video Analytics
- (SECON'22) Focus! Provisioning Attention-aware Detection for Real-time On-device Video Analytics
- (AAAI 2023) Towards Inference Efficient Deep Ensemble Learning
- (NeurIPS'22) Deep Ensembles Work, But Are They Necessary?
- (ICLR'22) Deep Ensembling with No Overhead of either Training or Testing: The All Round Blessings of Dynamic Sparsity
- (arXiv 2022) SANE: Specialization-Aware Neural Network Ensemble
- (NSDI'22) Check-N-Run: a Checkpointing System for Training Deep Learning Recommendation Models
- (NSDI'22) Cocktail: A Multidimensional Optimization for Model Serving in Cloud
- (InfoCom'23) Cross-Camera Inference on the Constrained Edge
- (AAAI'23 Oral) Multi-View Domain Adaptive Object Detection in Surveillance Cameras
- (CVPR'23) Stitchable Neural Networks
- (MobiCom'21) LegoDNN: Block-Grained Scaling of Deep Neural Networks for Mobile Vision
- awesome-mixture-of-experts
- (CVPR'20) EfficientDet: Scalable and Efficient Object Detection
- (CVPR'23) YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
- (ICLR'20) Once for All: Train One Network and Specialize it for Efficient Deployment
- (ICLR'22) Auto-scaling Vision Transformers without Training
- (2022) Task-Specific Expert Pruning for Sparse Mixture-of-Experts
- (MobiCom'23) AdaptiveNet: Post-deployment Neural Architecture Adaptation for Diverse Edge Environments
- (2022) Mixture-of-Experts with Expert Choice Routing
- (2022) ST-MoE: Designing Stable and Transferable Sparse Expert Models
- (2022) Towards Understanding the Mixture-of-Experts Layer in Deep Learning
- (ICLR'21 Spotlight) Long-tailed Recognition by Routing Diverse Distribution-Aware Experts
- (ECCV'20) Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-tailed Classification
- (CVPR'20) Overcoming Classifier Imbalance for Long-tail Object Detection with Balanced Group Softmax
- (CVPR'20) BBN: Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Recognition
- Awesome Tensor Compilers
- (MobiSys'23) Understanding and Optimizing Deep Learning Cold-Start Latency on Edge Devices
- (MobiCom'22) Romou: Rapidly Generate High-Performance Tensor Kernels for Mobile GPUs
- (NSDI'23) GEMEL: Model Merging for Memory-Efficient, Real-Time Video Analytics at the Edge
- (ATC'22) Tetris: Memory-efficient Serverless Inference through Tensor Sharing
- (OSDI'22) Microsecond-scale Preemption for Concurrent GPU-accelerated DNN Inferences
- (SenSys'22) BlastNet: Exploiting Duo-Blocks for Cross-Processor Real-Time DNN Inference
- (MobiSys'22) CoDL: Efficient CPU-GPU Co-execution for Deep Learning Inference on Mobile Devices
- (MobiSys'22) Band: coordinated multi-DNN inference on heterogeneous mobile processors
- (RTSS'22) Jellyfish: Timely Inference Serving for Dynamic Edge Networks
- (RTSS'19) Pipelined Data-Parallel CPU/GPU Scheduling for Multi-DNN Real-Time Inference
Keywords
- deep-learning: 6
- machine-learning: 4
- arm: 2
- convolution: 2
- deep-neural-networks: 2
- embedded-devices: 2
- ml: 2
- mnn: 2
- vulkan: 2
- winograd-algorithm: 2
- gpu-acceleration: 2
- inference: 2
- nvidia: 2
- tensorrt: 2
- edge-inference: 2
- knowledge-distillation: 2
- real-time: 2
- semantic-segmentation: 2
- tensorflow: 2
- video-inference: 2
- mobile: 2
- vision: 2
- code-generation: 2
- compiler: 2
- high-performance-computing: 2
- programming-language: 2
- tensor: 2