Awesome-AI-Systems
Resources for recent AI systems (deployment concerns, cost, and accessibility).
https://github.com/Jason-cs18/Awesome-AI-Systems
Deployment Concerns
Popular approaches (todo, summary)
- Cryptography for Safe Machine Learning. In MLSys'20. Shafi Goldwasser presented some techniques for applying cryptography to machine learning.
- Telekine: Secure Computing with Cloud GPUs. In NSDI'20.
- Themis: Fair and Efficient GPU Cluster Scheduling. In NSDI'20.
- Federated Optimization in Heterogeneous Networks. In MLSys'20. Proposed FedProx to tackle heterogeneity in federated networks (devices with non-identical data distributions); see the FedProx sketch after this list.<br>
- What is the State of Neural Network Pruning? In MLSys'20. Provided an open-source framework named ShrinkBench to **evaluate pruning methods**; a baseline magnitude-pruning sketch follows this list.<br>
- Attention-based Learning for Missing Data Imputation in HoloClean. In MLSys'20.
- A Systematic Methodology for Analysis of Deep Learning Hardware and Software Platforms. In MLSys'20.
- MLPerf Training Benchmark. In MLSys'20.
- FLEET: Flexible Efficient Ensemble Training for Heterogeneous Deep Neural Networks. In MLSys'20.
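
The FedProx method from the federated-optimization entry above is compact enough to sketch: each client minimizes its local loss plus a proximal term (mu/2)·||w − w_global||² that keeps heterogeneous local updates close to the global model. A minimal sketch under a least-squares local loss; the function and variable names are illustrative, not from the paper's code:

```python
import numpy as np

def fedprox_local_update(w_global, X, y, mu=0.1, lr=0.01, epochs=5):
    """One client's local solve: least-squares loss plus the FedProx proximal term."""
    w = w_global.copy()
    for _ in range(epochs):
        grad_loss = X.T @ (X @ w - y) / len(y)  # gradient of the local loss
        grad_prox = mu * (w - w_global)         # pulls w back toward the global model
        w -= lr * (grad_loss + grad_prox)
    return w

def fedprox_round(w_global, clients, mu=0.1):
    """One communication round: the server averages the clients' local solutions."""
    updates = [fedprox_local_update(w_global, X, y, mu) for X, y in clients]
    return np.mean(updates, axis=0)
```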
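The pruning survey above standardizes how pruning methods are compared; the near-universal baseline is global magnitude pruning, which zeroes the smallest-magnitude weights across the whole network. A minimal sketch of that baseline (illustrative only, not ShrinkBench's actual API):

```python
import numpy as np

def global_magnitude_prune(weights, sparsity=0.9):
    """Zero out the smallest-magnitude weights across all layers at once."""
    all_mags = np.concatenate([np.abs(w).ravel() for w in weights])
    threshold = np.quantile(all_mags, sparsity)  # e.g. 0.9 -> drop 90% of weights
    return [np.where(np.abs(w) < threshold, 0.0, w) for w in weights]
```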
Cost
Popular approaches (todo, summary)
- Theory & Systems for Weak Supervision. In MLSys'20.
- Distributed Hierarchical GPU Parameter Server for Massive Scale Deep Learning Ads Systems. In MLSys'20. Proposed a distributed hierarchical GPU parameter server to handle **massive-scale parameters** in deep learning ads systems.<br>
- Resource Elasticity in Distributed Deep Learning. In MLSys'20.
- Breaking the Memory Wall with Optimal Tensor Rematerialization. In MLSys'20. Formulated tensor rematerialization (recomputing activations instead of storing them) as an optimization problem, enabling training in a **memory-constrained environment**; see the checkpointing sketch after this list.<br>
- SkyNet: a Hardware-Efficient Method for Object Detection and Tracking on Embedded Systems. In MLSys'20. Proposed a **hardware-efficient** method that delivers state-of-the-art detection accuracy and speed on embedded systems.<br>
- Fine-Grained GPU Sharing Primitives for Deep Learning Applications. In MLSys'20. Proposed **fine-grained GPU sharing primitives for multiple DL workloads accessing the same GPU, but only tested simple scheduling algorithms (FIFO, SRTF, PACK and FAIR).** From my perspective, scheduling methods can be customized for specific applications, because the application context helps us design or implement the most suitable scheduling algorithm; see the SRTF sketch after this list. [Note](https://github.com/YanLu-nyu/Awesome-AI-Systems/blob/master/Notes/Salus_MLSys20.md) <br>
- Improving the Accuracy, Scalability, and Performance of Graph Neural Networks with Roc. In MLSys'20. Presented Roc, a distributed multi-GPU framework for fast GNN training and inference on graphs.<br>
- OPTIMUS: OPTImized matrix MUltiplication Structure for Transformer neural network accelerator. In MLSys'20.
- Memory-Driven Mixed Low Precision Quantization for Enabling Deep Network Inference on Microcontrollers. In MLSys'20. Proposed an end-to-end methodology for deploying high-accuracy deep networks on **microcontrollers** through mixed low-bitwidth compression and integer-only operations.<br>
- Riptide: Fast End-to-End Binarized Neural Networks. In MLSys'20.
- Searching for Winograd-aware Quantized Networks. In MLSys'20.
- Blink: Fast and Generic Collectives for Distributed ML. In MLSys'20.
- MotherNets: Rapid Deep Ensemble Learning. In MLSys'20.
- Willump: A Statistically-Aware End-to-end Optimizer for Machine Learning Inference. In MLSys'20. Introduced statistically-aware optimizations for ML inference pipelines, including approximation for **top-K queries**. [Note](https://github.com/YanLu-nyu/Awesome-AI-Systems/blob/master/Notes/Willump.md)<br>
- PoET-BiN: Power Efficient Tiny Binary Neurons. In MLSys'20. Proposed a **Look-up Table based, power-efficient** implementation of binary neurons for resource-constrained embedded devices.<br>
- Trained Quantization Thresholds for Accurate and Efficient Fixed-Point Inference of Deep Neural Networks. In MLSys'20.
- Improving Resource Efficiency of Deep Activity Recognition via Redundancy Reduction. In HotMobile'20. [Note](https://github.com/YanLu-nyu/Awesome-AI-Systems/blob/master/Notes/HAR_HotMobile_20.md)<br>
- Server-Driven Video Streaming for Deep Learning Inference. In SIGCOMM'20.
- Reducto: On-Camera Filtering for Resource-Efficient Real-Time Video Analytics. In SIGCOMM'20. Proposed on-camera frame filtering to trade off resource usage and accuracy in real-time video analytics.
- SLIDE: In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems. In MLSys'20. Proposed a Sub-Linear Deep Learning Engine named SLIDE to enable **fast training on large datasets and efficient utilization of current hardware**. The engine blends smart randomized algorithms with multicore parallelism and workload optimization.<br>
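
The rematerialization entry above trades compute for memory: activations are dropped in the forward pass and recomputed during backward. The paper solves for an optimal rematerialization schedule; PyTorch ships the simple uniform-segment variant of the same idea, sketched below (model size and segment count are illustrative):

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

# A deep stack whose intermediate activations would otherwise dominate memory.
model = nn.Sequential(*[nn.Sequential(nn.Linear(1024, 1024), nn.ReLU())
                        for _ in range(32)])
x = torch.randn(64, 1024, requires_grad=True)

# Keep activations only at 4 segment boundaries; everything inside a segment
# is recomputed on the backward pass, cutting peak memory at extra compute cost.
out = checkpoint_sequential(model, 4, x)
out.sum().backward()
```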
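The GPU-sharing entry above notes that only simple scheduling policies were tested; the flavor of those baselines is easy to see in code. A toy single-GPU SRTF (shortest-remaining-time-first) scheduler, with an illustrative job model of (arrival_time, run_time) pairs:

```python
import heapq

def srtf_avg_completion(jobs):
    """Average completion time under SRTF on one shared GPU.
    jobs: list of (arrival_time, run_time); preemption is assumed free."""
    jobs = sorted(jobs)                      # order by arrival time
    heap, t, i, done = [], 0, 0, []
    while heap or i < len(jobs):
        if not heap:
            t = max(t, jobs[i][0])           # GPU idles until the next arrival
        while i < len(jobs) and jobs[i][0] <= t:
            heapq.heappush(heap, [jobs[i][1], jobs[i][0]])  # key = remaining time
            i += 1
        remaining, arrival = heapq.heappop(heap)
        horizon = jobs[i][0] if i < len(jobs) else t + remaining
        step = min(remaining, horizon - t)   # run until done or next arrival
        t += step
        if step == remaining:
            done.append(t - arrival)         # job finished
        else:
            heapq.heappush(heap, [remaining - step, arrival])  # preempted
    return sum(done) / len(done)

print(srtf_avg_completion([(0, 10), (1, 2), (2, 1)]))  # short jobs preempt the long one
```

Swapping the heap for a plain FIFO queue gives the FIFO baseline; the paper's contribution is the primitives that make this kind of preemption cheap on a real GPU.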
Accessibility
Popular approaches (todo, summary)
- A System for Massively Parallel Hyperparameter Tuning. In MLSys'20. Proposed ASHA to solve **large-scale hyperparameter optimization** problems in distributed training; see the successive-halving sketch after this list. <br>
- PLink: Discovering and Exploiting Locality for Accelerated Distributed Training on the public Cloud. In MLSys'20.
- BPPSA: Scaling Back-propagation by Parallel Scan Algorithm. In MLSys'20. Reformulated the **back-propagation (BP) algorithm** as a scan operation to overcome BP's sequential limitation in parallel computing environments; see the scan sketch after this list.<br>
- MNN: A Universal and Efficient Inference Engine. In MLSys'20.
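
The hyperparameter-tuning entry above builds on successive halving: evaluate many configurations at a small budget, promote the top fraction to a larger budget, and repeat (ASHA's contribution is making the promotions asynchronous so stragglers don't stall a round). A synchronous sketch with an illustrative toy objective:

```python
import random

def successive_halving(configs, evaluate, min_budget=1, eta=3, rounds=3):
    """Each round, keep the best 1/eta of configs and grow the budget by eta."""
    budget = min_budget
    for _ in range(rounds):
        scored = sorted(((evaluate(c, budget), c) for c in configs), reverse=True)
        configs = [c for _, c in scored[: max(1, len(configs) // eta)]]
        budget *= eta
    return configs[0]

# Toy objective: learning rates near 0.1 score best; more budget helps a little.
evaluate = lambda lr, budget: -abs(lr - 0.1) + 0.01 * budget
best_lr = successive_halving([random.uniform(1e-3, 1.0) for _ in range(27)], evaluate)
print(best_lr)
```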
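The BPPSA entry above rests on one observation: reverse-mode differentiation of a layer chain multiplies Jacobians together, and matrix multiplication is associative, so the sequential chain can be evaluated in O(log L) dependent steps instead of O(L). A minimal sketch of just that associativity via a tree reduction; BPPSA itself computes all prefix products with a Blelloch scan, which this does not show:

```python
import numpy as np
from functools import reduce

def tree_product(mats):
    """Multiply a list of matrices via pairwise tree reduction;
    every product within a level is independent, hence parallelizable."""
    while len(mats) > 1:
        paired = [mats[i] @ mats[i + 1] for i in range(0, len(mats) - 1, 2)]
        if len(mats) % 2:                    # odd leftover rides along
            paired.append(mats[-1])
        mats = paired
    return mats[0]

# Transposed Jacobians of an 8-layer chain (random stand-ins).
jacobians = [np.random.randn(4, 4) for _ in range(8)]
sequential = reduce(lambda a, b: a @ b, jacobians)   # what plain BP does
parallel = tree_product(jacobians)                   # log-depth alternative
assert np.allclose(sequential, parallel)
```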
Books for Deep Learning (a popular learning approach in machine learning)
Course
Popular approaches (todo, summary)
- CSE 599W: Systems for ML. Introduces high-level and low-level optimization in Deep Learning frameworks.
- EECS 598: Systems for AI (W'20)
Conference
Tools
Popular approaches (todo, summary)