amx-guide

Advanced Matrix Extensions (AMX) Guide
https://github.com/mikeroyal/amx-guide


Linear Algebra Learning Resources
LLVM Learning Resources
- Clang - a front-end and tooling infrastructure for languages in the C language family (C, C++, Objective C/C++, OpenCL, CUDA, and RenderScript) for the LLVM project.
- LLVM Documentation
- LLVM | Apple Developer Forums
- Contributing to LLVM
- Getting Started with LLVM
- Getting Started with Clang
- How To Setup Clang Tooling For LLVM
- Using Clang-Tidy in Visual Studio
- Configure VS Code for Clang/LLVM on macOS
- LLVM Project GitHub
LLVM Tools, Libraries and Frameworks
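- TinyGo - a Go compiler (based on LLVM) intended for use in small places such as microcontrollers, WebAssembly (Wasm), and command-line tools.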
- Visual Studio Code
- Code Server
- Clang-Format - a tool that formats C/C++/Java/JavaScript/Objective-C/Objective-C++/Protobuf code.
- Clang-Tidy - a clang-based C++ "linter" tool. Its purpose is to provide an extensible framework for diagnosing and fixing typical programming errors, like style violations, interface misuse, or bugs that can be deduced via static analysis. clang-tidy is modular and provides a convenient interface for writing new checks.
- Clangd
- LLD - a linker from the LLVM project that is a drop-in replacement for system linkers and runs much faster than them. It also provides features that are useful for toolchain developers. The linker supports ELF (Unix), PE/COFF (Windows), Mach-O (macOS) and WebAssembly in descending order of completeness.
- FileCheck
- tblgen
- clang-tblgen
- lldb-tblgen
- llvm-tblgen
- mlir-tblgen
- lit
- llvm-exegesis
- llvm-locstats
- llvm-pdbutil
- llvm-profgen
- bugpoint
- llvm-extract
- llvm-bcanalyzer
- llvm-addr2line - an alias for llvm-symbolizer with different defaults, intended as a drop-in replacement for addr2line.
- llvm-ar
- llvm-cxxfilt
- llvm-install-name-tool - a tool for manipulating dynamic shared library install names and rpaths in Mach-O binaries.
- llvm-nm
- llvm-objcopy
- llvm-objdump
- llvm-ranlib
- llvm-readelf - a GNU-style LLVM Object Reader.
- llvm-size
- llvm-strings
- llvm-strip
ML Frameworks, Libraries, and Tools
  - Amazon SageMaker
  - Azure Databricks - a fast and collaborative Apache Spark-based big data analytics service designed for data science and data engineering. Azure Databricks sets up your Apache Spark environment in minutes, autoscales, and lets you collaborate on shared projects in an interactive workspace. It supports Python, Scala, R, Java, and SQL, as well as data science frameworks and libraries including TensorFlow, PyTorch, and scikit-learn.
  - Apple CoreML - a framework for integrating machine learning models into your app. An app uses Core ML APIs and user data to make predictions and to train or fine-tune models, all on the user's device. A model is the result of applying a machine learning algorithm to a set of training data. You use a model to make predictions based on new input data.
  - Tensorflow_macOS - a Mac-optimized version of TensorFlow and TensorFlow Addons for macOS 11.0+ accelerated using Apple's ML Compute framework.
  - Anaconda
  - PlaidML
  - OpenCV - a highly optimized library with a focus on real-time computer vision applications. The C++, Python, and Java interfaces support Linux, macOS, Windows, iOS, and Android.
  - Scikit-Learn
  - Caffe
  - Theano - a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently, including tight integration with NumPy.
  - nGraph - an open source C++ library, compiler, and runtime for deep learning. It aims to provide freedom, performance, and ease-of-use to AI developers.
  - Apache Spark Connector for SQL Server and Azure SQL - a high-performance connector that enables you to use transactional data in big data analytics and persist results for ad-hoc queries or reporting. The connector allows you to use any SQL database, on-premises or in the cloud, as an input data source or output data sink for Spark jobs.
  - Eclipse Deeplearning4J (DL4J) - a set of projects intended to support all the needs of a JVM-based (Scala, Kotlin, Clojure, and Groovy) deep learning application. This means starting with the raw data, loading and preprocessing it from wherever and whatever format it is in, through to building and tuning a wide variety of simple and complex deep learning networks.
  - TensorFlow - an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries and community resources that lets researchers push the state-of-the-art in ML and developers easily build and deploy ML-powered applications.
OpenCL Learning Resources
  - Open Computing Language (OpenCL) - an open standard for parallel programming of heterogeneous platforms consisting of CPUs, GPUs, and other hardware accelerators found in supercomputers, cloud servers, personal computers, mobile devices and embedded platforms.
  - OpenCL | NVIDIA Developer
  - Introduction to OpenCL on FPGAs Course | Coursera
  - Compiling OpenCL Kernel to FPGAs Course | Coursera
  - OpenCL Tutorials - StreamHPC
  - Introduction to Intel® OpenCL Tools
  - Khronos Group | GitHub
OpenCL Tools, Libraries and Frameworks
  - GPUVerify
  - OpenCL ICD Loader
  - clBLAS
  - clFFT
  - clSPARSE
  - clRNG
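  - CLsmith - a tool that applies two existing testing techniques, Random Differential Testing and Equivalence Modulo Inputs (EMI), in a many-core environment, OpenCL. Its primary feature is the generation of random OpenCL kernels, exercising many features of the language. It also brings a novel idea of applying EMI via dead-code injection.
  - Oclgrind - a virtual OpenCL device simulator, including an OpenCL runtime with ICD support. It provides tools for detecting memory access errors, data-races and barrier divergence, collecting instruction histograms, and for interactive OpenCL kernel debugging. The simulator is built on an interpreter for LLVM IR.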
  - NVIDIA® Nsight™ Visual Studio Edition
  - Radeon™ GPU Profiler
  - Radeon™ GPU Analyzer
  - AMD Radeon ProRender - a physically-based rendering engine that enables creative professionals to produce photorealistic images on virtually any GPU, any CPU, and any OS in over a dozen leading digital content creation and CAD applications.
  - Intel® SDK For OpenCL™ Applications - a toolkit for offloading compute-intensive workloads. Customize heterogeneous compute applications and accelerate performance with kernel-based programming.
  - NVIDIA cuDNN - a GPU-accelerated library of primitives for [deep neural networks](https://developer.nvidia.com/deep-learning). cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers. cuDNN accelerates widely used deep learning frameworks, including [Caffe2](https://caffe2.ai/), [Chainer](https://chainer.org/), [Keras](https://keras.io/), [MATLAB](https://www.mathworks.com/solutions/deep-learning.html), [MxNet](https://mxnet.incubator.apache.org/), [PyTorch](https://pytorch.org/), and [TensorFlow](https://www.tensorflow.org/).
  - NVIDIA Container Toolkit - a collection of tools and libraries for building and running GPU-accelerated containers. The toolkit includes a container runtime library and utilities to automatically configure containers to leverage NVIDIA GPUs.
Other Linear Topics
- i. Basis
  - wikimedia
  - wikimedia
- iii. Dimension and Basis for Vector Spaces
  - slideserve
- ii. Matrix representations of linear transformations
  - slideserve
- iv. Row space, column space, and rank of a matrix
  - slideshare
  - slideshare
- vi. Determinants
  - stackexchange
  - onlinemathlearning
- vii. Eigenvalues and eigenvectors
  - YouTube
- viii. Linear Regression
  - Linear regression
  - Medium
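  - PV (ParaVirtualization) - an efficient and lightweight virtualization technique introduced by the Xen Project team and later adopted by other virtualization solutions. PV does not require virtualization extensions from the host CPU and thus enables virtualization on hardware architectures that do not support hardware-assisted virtualization.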
  - Virtualized Infrastructure Manager (VIM)
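  - Management and Orchestration (MANO) - an ETSI-hosted initiative to develop an Open Source NFV Management and Orchestration (MANO) software stack aligned with ETSI NFV. Two of the key components of the ETSI NFV architectural framework are the NFV Orchestrator and VNF Manager, known as NFV MANO.
  - OpenRAN - an approach to radio access networks based on interoperability and standardization of RAN elements, so that hardware and software from different vendors can be combined in multi-vendor deployments.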
  - Open vSwitch(OVS)
  - Edge
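  - Multi-access edge computing (MEC) - an Industry Specification Group (ISG) within ETSI working to create a standardized, open environment that allows the efficient and seamless integration of applications from vendors, service providers, and third-parties across multi-vendor Multi-access Edge Computing platforms.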
  - Cloud-Native Network Functions(CNF)
  - Physical Network Function(PNF)
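  - KVM (for Kernel-based Virtual Machine) - a full virtualization solution for Linux on x86 hardware containing virtualization extensions (Intel VT or AMD-V). It consists of a loadable kernel module, kvm.ko, that provides the core virtualization infrastructure and a processor-specific module, kvm-intel.ko or kvm-amd.ko.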
  - VirtManager
  - HyperKit - a toolkit for embedding hypervisor capabilities in your application. It includes a complete hypervisor, based on xhyve/bhyve, optimized for lightweight virtual machines and container deployment. It is designed to be interfaced with higher-level components such as the [VPNKit](https://github.com/moby/vpnkit) and [DataKit](https://github.com/moby/datakit). HyperKit currently only supports macOS using the [Hypervisor.framework](https://developer.apple.com/library/mac/documentation/DriversKernelHardware/Reference/Hypervisor/index.html), making it a core component of Docker Desktop for Mac.
  - Intel® Graphics Virtualization Technology (Intel® GVT) - a full GPU virtualization solution with mediated pass-through, starting from 4th generation Intel Core (TM) processors with Intel processor graphics (Broadwell and newer). It can be used to virtualize the GPU for multiple guest virtual machines, effectively providing near-native graphics performance in the virtual machine and still letting your host use the virtualized GPU normally.
  - Apple Hypervisor - a framework for building virtualization solutions on top of a lightweight hypervisor, without third-party kernel extensions. Hypervisor provides C APIs so you can interact with virtualization technologies in user space, without writing kernel extensions (KEXTs). As a result, the apps you create using this framework are suitable for distribution on the [Mac App Store](https://www.appstore.com/).
  - Apple Virtualization Framework - a framework that provides high-level APIs for creating and managing virtual machines on Apple silicon and Intel-based Mac computers. This framework is used to boot and run a Linux-based operating system in a custom environment that you define. It also supports the [Virtio specification](https://www.redhat.com/en/virtio-networking-series), which defines standard interfaces for many device types, including network, socket, serial port, storage, entropy, and memory-balloon devices.
  - Apple Paravirtualized Graphics Framework - a framework that implements hardware-accelerated graphics for macOS running in a virtual machine, hereafter known as the guest. The operating system provides a graphics driver that runs inside the guest, communicating with the framework in the host operating system to take advantage of Metal-accelerated graphics.
  - Cloud Hypervisor - an open source Virtual Machine Monitor (VMM) that runs on top of KVM. Cloud Hypervisor is implemented in Rust and is based on the [rust-vmm](https://github.com/rust-vmm) crates.
  - Xen
  - Ganeti
  - Packer
  - Vagrant - a tool for building and managing virtual machine environments in a single workflow. With an easy-to-use workflow and focus on automation, Vagrant lowers development environment setup time, increases production parity, and makes the "works on my machine" excuse a relic of the past. It provides easy to configure, reproducible, and portable work environments built on top of industry-standard technology and controlled by a single consistent workflow to help maximize the productivity and flexibility of you and your team.
  - VMware vSphere Hypervisor - a bare-metal hypervisor that virtualizes servers, allowing you to consolidate your applications while saving time and money managing your IT infrastructure.
  - VMware Workstation
  - Hyper-V
Parallel Computing Learning Resources
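  - Parallel Computing - a type of computation in which many calculations or processes are carried out simultaneously. Large problems can often be divided into smaller ones, which can then be solved at the same time. There are several different forms of parallel computing: [bit-level](https://en.wikipedia.org/wiki/Bit-level_parallelism), [instruction-level](https://en.wikipedia.org/wiki/Instruction-level_parallelism), [data](https://en.wikipedia.org/wiki/Data_parallelism), and [task parallelism](https://en.wikipedia.org/wiki/Task_parallelism).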
  - Accelerated Computing - Training | NVIDIA Developer
  - Fundamentals of Accelerated Computing with CUDA Python Course | NVIDIA
  - Top Parallel Computing Courses Online | Coursera
  - Top Parallel Computing Courses Online | Udemy
  - Scientific Computing Masterclass: Parallel and Distributed
  - Learn Parallel Computing in Python | Udemy
  - GPU computing in Vulkan | Udemy
  - High Performance Computing Courses | Udacity
  - Parallel Computing Courses | Stanford Online
  - Parallel Computing with CUDA | Pluralsight
  - HPC Architecture and System Design | Intel
Parallel Computing Tools, Libraries, and Frameworks
  - MATLAB Parallel Server™
  - Statistics and Machine Learning Toolbox™
  - OpenMP - an API that supports multi-platform shared-memory parallel programming in C/C++ and Fortran. The OpenMP API defines a portable, scalable model with a simple and flexible interface for developing parallel applications on platforms from the desktop to the supercomputer.
  - CUDA®
  - Message Passing Interface (MPI) - a standardized and portable message-passing standard designed to function on parallel computing architectures.
  - Slurm - a free, open-source workload manager designed specifically to satisfy the demanding needs of high performance computing.
  - AWS ParallelCluster - an AWS-supported open source cluster management tool that makes it easy for you to deploy and manage High Performance Computing (HPC) clusters on AWS. ParallelCluster uses a simple text file to model and provision all the resources needed for your HPC applications in an automated and secure manner.
  - Numba - an open source, NumPy-aware optimizing compiler for Python sponsored by Anaconda, Inc. It uses the LLVM compiler project to generate machine code from Python syntax. Numba can compile a large subset of numerically-focused Python, including many NumPy functions. Additionally, Numba has support for automatic parallelization of loops, generation of GPU-accelerated code, and creation of ufuncs and C callbacks.
  - Chainer - a Python-based deep learning framework aiming at flexibility. It provides automatic differentiation APIs based on the define-by-run approach (dynamic computational graphs) as well as object-oriented high-level APIs to build and train neural networks. It also supports CUDA/cuDNN using [CuPy](https://github.com/cupy/cupy) for high performance training and inference.
  - cuML - a suite of libraries that implement machine learning algorithms and mathematical primitives. In most cases, its Python API matches the API of scikit-learn.
  - Apache Flume
  - Apache HBase™ - an open-source, NoSQL, distributed big data store. It enables random, strictly consistent, real-time access to petabytes of data. HBase is very effective for handling large, sparse datasets. HBase serves as a direct input and output to the Apache MapReduce framework for Hadoop, and works with Apache Phoenix to enable SQL-like queries over HBase tables.
  - Hadoop Distributed File System (HDFS) - a distributed file system that handles large data sets running on commodity hardware. HDFS is one of the major components of Apache Hadoop, the others being MapReduce and YARN.
  - Apache Arrow - a language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware like CPUs and GPUs.
  - Apache Spark™ - a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing.
  - Apache PredictionIO
  - Microsoft Project Bonsai - a low-code AI platform that speeds AI-powered automation development and is part of the Autonomous Systems suite from Microsoft. Bonsai is used to build AI components that can provide operator guidance or make independent decisions to optimize process variables, improve production efficiency, and reduce downtime.
  - Cluster Manager for Apache Kafka(CMAK)
  - BigDL
  - Apache Beam - an open source, unified model and set of language-specific SDKs for defining and executing data processing workflows, and also data ingestion and integration flows, supporting Enterprise Integration Patterns (EIPs) and Domain Specific Languages (DSLs).
  - Jupyter Notebook - an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Jupyter is used widely in industries that do data cleaning and transformation, numerical simulation, statistical modeling, data visualization, data science, and machine learning.
  - Neo4j - an enterprise-strength graph database that combines native graph storage, advanced security, scalable speed-optimized architecture, and ACID compliance to ensure predictability and integrity of relationship-based queries.
  - Elasticsearch - a search engine based on the Lucene library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. Elasticsearch is developed in Java.
  - Logstash
  - Kibana
  - Trino - a distributed SQL query engine for big data. It can greatly speed up ETL processes, allow them all to use standard SQL statements, and work with numerous data sources and targets all in the same system.
  - Redis (REmote DIctionary Server) - an in-memory data structure store, used as a database, cache, and message broker. It provides data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes, and streams.
  - Apache OpenNLP - an open-source machine learning based toolkit for processing natural language text. It features an API for use cases like [Named Entity Recognition](https://en.wikipedia.org/wiki/Named-entity_recognition), Sentence Detection, [POS (Part-Of-Speech) tagging](https://en.wikipedia.org/wiki/Part-of-speech_tagging), [Tokenization](https://en.wikipedia.org/wiki/Tokenization_(data_security)), [Feature extraction](https://en.wikipedia.org/wiki/Feature_extraction), [Chunking](https://en.wikipedia.org/wiki/Chunking_(psychology)), [Parsing](https://en.wikipedia.org/wiki/Parsing), and [Coreference resolution](https://en.wikipedia.org/wiki/Coreference).
  - Open Neural Network Exchange (ONNX) - an open source format for AI models, both deep learning and traditional ML. It defines an extensible computation graph model, as well as definitions of built-in operators and standard data types.
  - AutoGluon - a toolkit that automates machine learning tasks. With a few lines of code, you can train and deploy high-accuracy deep learning models on tabular, image, and text data.
  - Portable Batch System (PBS) Pro
Types of Accelerators

Programming Languages

C++ 17, C 10, Python 7, Scala 3, Rust 2, C# 2, Verilog 2, Nim 1, Shell 1, Java 1
