
awesome-ai

A curated list of AI tools, courses, books, and resources for anyone interested in exploring artificial intelligence, machine learning, and deep learning.
https://github.com/nickvasdev/awesome-ai


  • Books

  • Courses

  • Learning Resources

  • Newsletters

    • Awesome GitHub Resources

  • On-Device AI

    • Efficient Architectures for On-Device LLMs

      • MobileLLM - High accuracy, optimized for sub-billion parameter models, embedding sharing, grouped-query attention, reduced model size.
      • EdgeShard - Up to 50% latency reduction, collaborative edge-cloud computing, optimal shard placement, distributed model components reduce individual device load.
      • LLMCad - Up to 9.3× speedup in token generation, generate-then-verify, token tree generation, smaller LLM for token generation, larger LLM for verification.
      • Any-Precision LLM - Supports multiple precisions efficiently, post-training quantization, memory-efficient design, substantial memory savings with versatile model precisions.
      • Breakthrough Memory - Up to 4.5× performance improvement, PIM and PNM technologies enhance memory processing, enhanced memory bandwidth and capacity.
      • MELTing Point - Provides systematic performance evaluation, analyzes impacts of quantization, efficient model evaluation, evaluates memory and computational efficiency trade-offs.
      • LLMaaS on device - Reduces context switching latency significantly, stateful execution, fine-grained KV cache compression, efficient memory management with tolerance-aware compression and swapping.
      • LocMoE - Reduces training time per epoch by up to 22.24%, orthogonal gating weights, locality-based expert regularization, minimizes communication overhead with group-wise All-to-All and recompute pipeline.
      • EdgeMoE - Significant performance improvements on edge devices, expert-wise bitwidth adaptation, preloading experts, efficient memory management through expert-by-expert computation reordering.
      • JetMoE - Outperforms Llama2-7B and 13B-Chat with fewer parameters, reduces inference computation by 70% using sparse activation, 8B total parameters, only 2B activated per input token.
      • PanGu-π Pro - Neural architecture, parameter initialization, and optimization strategy for billion-level parameter models, embedding sharing, tokenizer compression, reduced model size via architecture tweaking.
      • Zamba2 - 2x faster time-to-first-token, a 27% reduction in memory overhead, and a 1.29x lower generation latency compared to Phi3-3.8B, hybrid Mamba2/Attention architecture and shared transformer block, 2.7B parameters, fewer KV-states due to reduced attention.
    • Evolution of On-Device LLMs

      • TinyLlama - Open-source small language model.
      • MobileVLM V2 - Faster and stronger baseline for Vision Language Model.
      • MobileAIBench - Benchmarking LLMs and LMMs for on-device use cases.
      • Octopus series papers - On-device language models for different applications. [[Octopus v2]](https://arxiv.org/abs/2404.01744) [[Octopus v3]](https://arxiv.org/abs/2404.11459) [[Octopus v4]](https://arxiv.org/abs/2404.19296) [[Github]](https://github.com/NexaAI).
      • The Era of 1-bit LLMs - All large language models are in 1.58 bits.
      • AWQ - Activation-aware weight quantization for LLM compression and acceleration. [[Github]](https://github.com/mit-han-lab/llm-awq).
      • Small Language Models - Survey, measurements, and insights.
    • General Efficiency and Performance Improvements

    • Limitations of Cloud-Based LLM Inference and Advantages of On-Device Inference

    • LLM Architecture Foundations

    • Memory and Computational Efficiency

    • On-Device LLMs Training

      • OpenELM - An efficient language model family with open training and inference framework. [[Github]](https://github.com/apple/corenet).
    • The Performance Indicator of On-Device LLMs

      • MNN - A lightweight deep neural network inference engine.
      • PowerInfer-2 - Fast large language model inference on a smartphone. [[Github]](https://github.com/SJTU-IPADS/PowerInfer).
      • llama.cpp - LLM inference in C/C++.
      • PowerInfer - Fast large language model serving with a consumer-grade GPU. [[Github]](https://github.com/SJTU-IPADS/PowerInfer).
  • Tools

    • Chat

      • ChatGPT - A free-to-use AI system that allows users to engage in conversations, gain insights, automate tasks, and witness the future of AI all in one place.
      • Gemini - Direct access to Google AI for writing, planning, learning, and more.
    • Commercial Tools

      • Taskade - Build, train, and deploy AI agents to automate tasks, research, and collaborate in real-time.
    • Images

      • Midjourney - AI image generation.
      • DALL·E 3 - AI system that creates realistic images and art from a natural-language description.
    • Video

      • Sora - Text-to-video AI model that creates imaginative scenes from text.
      • Runway - AI video generation.
  • Videos
