Awesome-AI-Systems
Resources for recent AI systems (deployment concerns, cost, and accessibility).
https://github.com/Jason-cs18/Awesome-AI-Systems
Deployment Concerns
Popular approaches (todo, summary)
- Cryptography for Safe Machine Learning. In MLSys'20. Shafi Goldwasser presented some techniques for applying cryptography to machine learning.
- Telekine: Secure Computing with Cloud GPUs. In NSDI'20.
- Themis: Fair and Efficient GPU Cluster Scheduling. In NSDI'20.
- Federated Optimization in Heterogeneous Networks. In MLSys'20. Proposed FedProx to tackle heterogeneity in federated networks (devices with non-identical data distributions); see the FedProx sketch after this list.<br>
- What is the State of Neural Network Pruning? In MLSys'20. Provided an open-source framework named ShrinkBench to **evaluate pruning methods**; a baseline magnitude-pruning sketch follows this list.<br>
- Attention-based Learning for Missing Data Imputation in HoloClean. In MLSys'20.
- A Systematic Methodology for Analysis of Deep Learning Hardware and Software Platforms. In MLSys'20.
- MLPerf Training Benchmark. In MLSys'20.
- FLEET: Flexible Efficient Ensemble Training for Heterogeneous Deep Neural Networks. In MLSys'20.
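
The FedProx method from the federated-optimization entry above is compact enough to sketch: each client minimizes its local loss plus a proximal term (mu/2)·||w − w_global||² that keeps heterogeneous local updates close to the global model. A minimal sketch under a least-squares local loss; the function and variable names are illustrative, not from the paper's code:

```python
import numpy as np

def fedprox_local_update(w_global, X, y, mu=0.1, lr=0.01, epochs=5):
    """One client's local solve: least-squares loss plus the FedProx proximal term."""
    w = w_global.copy()
    for _ in range(epochs):
        grad_loss = X.T @ (X @ w - y) / len(y)  # gradient of the local loss
        grad_prox = mu * (w - w_global)         # pulls w back toward the global model
        w -= lr * (grad_loss + grad_prox)
    return w

def fedprox_round(w_global, clients, mu=0.1):
    """One communication round: the server averages the clients' local solutions."""
    updates = [fedprox_local_update(w_global, X, y, mu) for X, y in clients]
    return np.mean(updates, axis=0)
```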
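The pruning survey above standardizes how pruning methods are compared; the near-universal baseline is global magnitude pruning, which zeroes the smallest-magnitude weights across the whole network. A minimal sketch of that baseline (illustrative only, not ShrinkBench's actual API):

```python
import numpy as np

def global_magnitude_prune(weights, sparsity=0.9):
    """Zero out the smallest-magnitude weights across all layers at once."""
    all_mags = np.concatenate([np.abs(w).ravel() for w in weights])
    threshold = np.quantile(all_mags, sparsity)  # e.g. 0.9 -> drop 90% of weights
    return [np.where(np.abs(w) < threshold, 0.0, w) for w in weights]
```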
Cost
Popular approaches (todo, summary)
- Theory & Systems for Weak Supervision. In MLSys'20.
- Distributed Hierarchical GPU Parameter Server for Massive Scale Deep Learning Ads Systems. In MLSys'20. Proposed a distributed hierarchical GPU parameter server to handle **massive-scale parameters** in deep learning ads systems.<br>
- Resource Elasticity in Distributed Deep Learning. In MLSys'20.
- Breaking the Memory Wall with Optimal Tensor Rematerialization. In MLSys'20. Formulated tensor rematerialization (recomputing activations instead of storing them) as an optimization problem, enabling training in a **memory-constrained environment**; see the checkpointing sketch after this list.<br>
- SkyNet: a Hardware-Efficient Method for Object Detection and Tracking on Embedded Systems. In MLSys'20. Proposed a **hardware-efficient** method that delivers state-of-the-art detection accuracy and speed on embedded systems.<br>
- Fine-Grained GPU Sharing Primitives for Deep Learning Applications. In MLSys'20. Proposed **fine-grained GPU sharing primitives for multiple DL workloads accessing the same GPU, but only tested simple scheduling algorithms (FIFO, SRTF, PACK and FAIR).** From my perspective, scheduling methods can be customized for specific applications, because the application context helps us design or implement the most suitable scheduling algorithm; see the SRTF sketch after this list. [Note](https://github.com/YanLu-nyu/Awesome-AI-Systems/blob/master/Notes/Salus_MLSys20.md) <br>
- Improving the Accuracy, Scalability, and Performance of Graph Neural Networks with Roc. In MLSys'20. Presented Roc, a distributed multi-GPU framework for fast GNN training and inference on graphs.<br>
- OPTIMUS: OPTImized matrix MUltiplication Structure for Transformer neural network accelerator. In MLSys'20.
- Memory-Driven Mixed Low Precision Quantization for Enabling Deep Network Inference on Microcontrollers. In MLSys'20. Proposed an end-to-end methodology for deploying high-accuracy deep networks on **microcontrollers** through mixed low-bitwidth compression and integer-only operations.<br>
- Riptide: Fast End-to-End Binarized Neural Networks. In MLSys'20.
- Searching for Winograd-aware Quantized Networks. In MLSys'20.
- Blink: Fast and Generic Collectives for Distributed ML. In MLSys'20.
- MotherNets: Rapid Deep Ensemble Learning. In MLSys'20.
- Willump: A Statistically-Aware End-to-end Optimizer for Machine Learning Inference. In MLSys'20. Introduced statistically-aware optimizations for ML inference pipelines, including approximation for **top-K queries**. [Note](https://github.com/YanLu-nyu/Awesome-AI-Systems/blob/master/Notes/Willump.md)<br>
- PoET-BiN: Power Efficient Tiny Binary Neurons. In MLSys'20. Proposed a **Look-up Table based, power-efficient** implementation of binary neurons for resource-constrained embedded devices.<br>
- Trained Quantization Thresholds for Accurate and Efficient Fixed-Point Inference of Deep Neural Networks. In MLSys'20.
- Improving Resource Efficiency of Deep Activity Recognition via Redundancy Reduction. In HotMobile'20. [Note](https://github.com/YanLu-nyu/Awesome-AI-Systems/blob/master/Notes/HAR_HotMobile_20.md)<br>
- Server-Driven Video Streaming for Deep Learning Inference. In SIGCOMM'20.
- Reducto: On-Camera Filtering for Resource-Efficient Real-Time Video Analytics. In SIGCOMM'20. Proposed on-camera frame filtering to trade off resource usage and accuracy in real-time video analytics.
- SLIDE: In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems. In MLSys'20. Proposed a Sub-Linear Deep Learning Engine named SLIDE to enable **fast training on large datasets and efficient utilization of current hardware**. The engine blends smart randomized algorithms with multicore parallelism and workload optimization.<br>
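
The rematerialization entry above trades compute for memory: activations are dropped in the forward pass and recomputed during backward. The paper solves for an optimal rematerialization schedule; PyTorch ships the simple uniform-segment variant of the same idea, sketched below (model size and segment count are illustrative):

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

# A deep stack whose intermediate activations would otherwise dominate memory.
model = nn.Sequential(*[nn.Sequential(nn.Linear(1024, 1024), nn.ReLU())
                        for _ in range(32)])
x = torch.randn(64, 1024, requires_grad=True)

# Keep activations only at 4 segment boundaries; everything inside a segment
# is recomputed on the backward pass, cutting peak memory at extra compute cost.
out = checkpoint_sequential(model, 4, x)
out.sum().backward()
```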
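The GPU-sharing entry above notes that only simple scheduling policies were tested; the flavor of those baselines is easy to see in code. A toy single-GPU SRTF (shortest-remaining-time-first) scheduler, with an illustrative job model of (arrival_time, run_time) pairs:

```python
import heapq

def srtf_avg_completion(jobs):
    """Average completion time under SRTF on one shared GPU.
    jobs: list of (arrival_time, run_time); preemption is assumed free."""
    jobs = sorted(jobs)                      # order by arrival time
    heap, t, i, done = [], 0, 0, []
    while heap or i < len(jobs):
        if not heap:
            t = max(t, jobs[i][0])           # GPU idles until the next arrival
        while i < len(jobs) and jobs[i][0] <= t:
            heapq.heappush(heap, [jobs[i][1], jobs[i][0]])  # key = remaining time
            i += 1
        remaining, arrival = heapq.heappop(heap)
        horizon = jobs[i][0] if i < len(jobs) else t + remaining
        step = min(remaining, horizon - t)   # run until done or next arrival
        t += step
        if step == remaining:
            done.append(t - arrival)         # job finished
        else:
            heapq.heappush(heap, [remaining - step, arrival])  # preempted
    return sum(done) / len(done)

print(srtf_avg_completion([(0, 10), (1, 2), (2, 1)]))  # short jobs preempt the long one
```

Swapping the heap for a plain FIFO queue gives the FIFO baseline; the paper's contribution is the primitives that make this kind of preemption cheap on a real GPU.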
Accessibility
Popular approaches (todo, summary)
- A System for Massively Parallel Hyperparameter Tuning. In MLSys'20. Proposed ASHA to solve **large-scale hyperparameter optimization** problems in distributed training; see the successive-halving sketch after this list. <br>
- PLink: Discovering and Exploiting Locality for Accelerated Distributed Training on the public Cloud. In MLSys'20.
- BPPSA: Scaling Back-propagation by Parallel Scan Algorithm. In MLSys'20. Reformulated the **back-propagation (BP) algorithm** as a scan operation to overcome BP's sequential limitation in parallel computing environments; see the scan sketch after this list.<br>
- MNN: A Universal and Efficient Inference Engine. In MLSys'20.
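
The hyperparameter-tuning entry above builds on successive halving: evaluate many configurations at a small budget, promote the top fraction to a larger budget, and repeat (ASHA's contribution is making the promotions asynchronous so stragglers don't stall a round). A synchronous sketch with an illustrative toy objective:

```python
import random

def successive_halving(configs, evaluate, min_budget=1, eta=3, rounds=3):
    """Each round, keep the best 1/eta of configs and grow the budget by eta."""
    budget = min_budget
    for _ in range(rounds):
        scored = sorted(((evaluate(c, budget), c) for c in configs), reverse=True)
        configs = [c for _, c in scored[: max(1, len(configs) // eta)]]
        budget *= eta
    return configs[0]

# Toy objective: learning rates near 0.1 score best; more budget helps a little.
evaluate = lambda lr, budget: -abs(lr - 0.1) + 0.01 * budget
best_lr = successive_halving([random.uniform(1e-3, 1.0) for _ in range(27)], evaluate)
print(best_lr)
```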
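The BPPSA entry above rests on one observation: reverse-mode differentiation of a layer chain multiplies Jacobians together, and matrix multiplication is associative, so the sequential chain can be evaluated in O(log L) dependent steps instead of O(L). A minimal sketch of just that associativity via a tree reduction; BPPSA itself computes all prefix products with a Blelloch scan, which this does not show:

```python
import numpy as np
from functools import reduce

def tree_product(mats):
    """Multiply a list of matrices via pairwise tree reduction;
    every product within a level is independent, hence parallelizable."""
    while len(mats) > 1:
        paired = [mats[i] @ mats[i + 1] for i in range(0, len(mats) - 1, 2)]
        if len(mats) % 2:                    # odd leftover rides along
            paired.append(mats[-1])
        mats = paired
    return mats[0]

# Transposed Jacobians of an 8-layer chain (random stand-ins).
jacobians = [np.random.randn(4, 4) for _ in range(8)]
sequential = reduce(lambda a, b: a @ b, jacobians)   # what plain BP does
parallel = tree_product(jacobians)                   # log-depth alternative
assert np.allclose(sequential, parallel)
```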
Books for Deep Learning (a popular learning approach in machine learning)
Course
Popular approaches (todo, summary)
- CSE 599W: Systems for ML. Introduces high-level and low-level optimization in Deep Learning frameworks.
- EECS 598: Systems for AI (W'20)
Conference
Tools
Popular approaches (todo, summary)