gpu-guide
Graphics Processing Unit (GPU) Architecture Guide
https://github.com/mikeroyal/gpu-guide
MATLAB Learning Resources
- Getting Started with MATLAB
- MathWorks Certification Program
- Apache Spark Basics | MATLAB & Simulink
- MATLAB Hadoop and Spark | MATLAB & Simulink
- MATLAB Online Courses from Udemy
- MATLAB Online Courses from Coursera
- MATLAB Online Courses from edX
- Building a MATLAB GUI
- MATLAB Style Guidelines 2.0
- Setting Up Git Source Control with MATLAB & Simulink
- Pull, Push and Fetch Files with Git with MATLAB & Simulink
- Create New Repository with MATLAB & Simulink
- PRMLT
- MATLAB and Simulink Training from MATLAB Academy
MATLAB Tools, Libraries, Frameworks
- MATLAB and Simulink Services & Applications List
- MATLAB in the Cloud - Runs MATLAB in cloud environments, including [AWS](https://aws.amazon.com/) and [Azure](https://azure.microsoft.com/).
- Simulink - A block diagram environment for Model-Based Design. It supports simulation, automatic code generation, and continuous testing of embedded systems.
- Simulink Online™
- MATLAB Drive™
- MATLAB Schemer
- SoC Blockset™
- Wireless HDL Toolbox™ - Provides pre-verified, hardware-ready Simulink® blocks and subsystems for developing 5G, LTE, and custom OFDM-based wireless communication applications. It includes reference applications, IP blocks, and gateways between frame and sample-based processing.
- ThingSpeak™ - An IoT analytics service, often used for prototyping and proof-of-concept IoT systems that require analytics.
- SEA-MAT
- Gramm - A MATLAB plotting toolbox that provides a high-level interface to produce publication-quality plots of complex data with varied statistical visualizations. Gramm is inspired by R's ggplot2 library.
- hctsa - A software package for highly comparative time-series analysis using MATLAB.
- YALMIP
- GNU Octave - A high-level interpreted language, primarily intended for numerical computations. It provides capabilities for the numerical solution of linear and nonlinear problems, and for performing other numerical experiments. It also provides extensive graphics capabilities for data visualization and manipulation.
- MATLAB Online™
- Plotly
Metal Learning Resources
- Apple Developer Documentation
- Metal Shading Language Specification
- Metal Sample code
- Metal plugin for TensorFlow
- Metal Developer discussions
- MetalKit
- Using Metal Feature Set Tables
- Metal Performance Shaders
- Optimizing Performance with the GPU Counters Instrument
- Enabling Frame Capture
- Reducing the Memory Footprint of Metal Apps
- Metal Developer Tools for Windows
Metal Tools, Libraries, and Frameworks
- Apple Foundation Framework
- Apple Core Animation Framework
- Apple Core Graphics Framework - Provides low-level, lightweight 2D rendering with high output fidelity.
- Paravirtualized Graphics Framework - Implements hardware-accelerated graphics for macOS running in a virtual machine (the guest). The macOS operating system provides a graphics driver that runs inside the guest and communicates with the framework in the host operating system to take advantage of Metal-accelerated graphics.
ML Frameworks, Libraries, and Tools
- Amazon SageMaker
- Azure Databricks - An Apache Spark-based big data analytics service designed for data science and data engineering. Azure Databricks sets up your Apache Spark environment in minutes, autoscales, and lets you collaborate on shared projects in an interactive workspace. It supports Python, Scala, R, Java, and SQL, as well as data science frameworks and libraries including TensorFlow, PyTorch, and scikit-learn.
- Apple CoreML - A framework for integrating machine learning models into your app. Your app uses Core ML APIs and user data to make predictions and to train or fine-tune models, all on the user's device. A model is the result of applying a machine learning algorithm to a set of training data. You use a model to make predictions based on new input data.
- Tensorflow_macOS - A Mac-optimized version of TensorFlow and TensorFlow Addons for macOS 11.0+ accelerated using Apple's ML Compute framework.
- Anaconda
- PlaidML
- OpenCV - A library focused on real-time computer vision applications. The C++, Python, and Java interfaces support Linux, macOS, Windows, iOS, and Android.
- Scikit-Learn
- Caffe
- Theano - A Python library for defining, optimizing, and evaluating mathematical expressions involving multi-dimensional arrays efficiently, with tight integration with NumPy.
- nGraph - An open source C++ library, compiler, and runtime for deep learning that aims for ease of use for AI developers.
- Apache Spark Connector for SQL Server and Azure SQL - A high-performance connector that lets you use transactional data in big data analytics and persist results for ad-hoc queries or reporting. The connector allows you to use any SQL database, on-premises or in the cloud, as an input data source or output data sink for Spark jobs.
- Eclipse Deeplearning4J (DL4J) - A set of projects intended to support all the needs of a JVM-based (Scala, Kotlin, Clojure, and Groovy) deep learning application, from loading and preprocessing raw data in whatever format it arrives to building and tuning a wide variety of simple and complex deep learning networks.
- TensorFlow - An end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, and community resources that lets researchers push the state of the art in ML and developers easily build and deploy ML-powered applications.
OpenCL Learning Resources
- Open Computing Language (OpenCL) - An open standard for parallel programming of heterogeneous platforms consisting of CPUs, GPUs, and other hardware accelerators found in supercomputers, cloud servers, personal computers, mobile devices, and embedded platforms.
- OpenCL | NVIDIA Developer
- Introduction to OpenCL on FPGAs Course | Coursera
- Compiling OpenCL Kernel to FPGAs Course | Coursera
- OpenCL Tutorials - StreamHPC
- Introduction to Intel® OpenCL Tools
- Khronos Group | GitHub
- OpenCL | GitHub
OpenCL Tools, Libraries and Frameworks
- GPUVerify
- OpenCL ICD Loader
- clBLAS
- clFFT
- clSPARSE
- clRNG
- CLsmith - A tool that applies random differential testing and Equivalence Modulo Inputs (EMI) testing in a many-core environment, OpenCL. Its primary feature is the generation of random OpenCL kernels that exercise many features of the language. It also introduces the idea of applying EMI via dead-code injection.
- Oclgrind - A virtual OpenCL device simulator with utilities for detecting data-races and barrier divergence, collecting instruction histograms, and interactive OpenCL kernel debugging. The simulator is built on an interpreter for LLVM IR.
- NVIDIA® Nsight™ Visual Studio Edition
- Radeon™ GPU Profiler
- Radeon™ GPU Analyzer
- AMD Radeon ProRender - A physically-based rendering engine that enables creative professionals to produce photorealistic images on virtually any GPU, any CPU, and any OS in over a dozen leading digital content creation and CAD applications.
- Intel® SDK For OpenCL™ Applications - A toolkit for offloading compute-intensive workloads. Customize heterogeneous compute applications and accelerate performance with kernel-based programming.
- NVIDIA cuDNN - A GPU-accelerated library of primitives for [deep neural networks](https://developer.nvidia.com/deep-learning). cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers. cuDNN accelerates widely used deep learning frameworks, including [Caffe2](https://caffe2.ai/), [Chainer](https://chainer.org/), [Keras](https://keras.io/), [MATLAB](https://www.mathworks.com/solutions/deep-learning.html), [MxNet](https://mxnet.incubator.apache.org/), [PyTorch](https://pytorch.org/), and [TensorFlow](https://www.tensorflow.org/).
- NVIDIA Container Toolkit - Includes a container runtime library and utilities to automatically configure containers to leverage NVIDIA GPUs.
- RenderDoc - A stand-alone graphics debugger that allows quick and easy single-frame capture and detailed introspection of any application using Vulkan, D3D11, OpenGL & OpenGL ES or D3D12 across Windows, Linux, Android, Stadia, or Nintendo Switch™.
- NVIDIA NGC - A hub for GPU-optimized software for deep learning, machine learning, and high-performance computing (HPC) workloads.
OpenGL Learning Resources
- OpenGL ES™
- WebGL™ - A cross-platform, royalty-free web standard for a low-level 3D graphics API based on OpenGL ES, exposed to JavaScript via the HTML5 Canvas element.
- Top OpenGL Courses Online | Udemy
- Getting Started with OpenGL
- WebGL Public Wiki
- WebGL 2.0 Specification
- OpenGL Online Training Courses | LinkedIn Learning
- Top OpenGL Courses Online | Coursera
- OpenGL Reference Cards
OpenGL Tools, Libraries, and Frameworks
- BuGLe - A tool for OpenGL debugging on UNIX-like OSes. BuGLe combines a graphical OpenGL debugger with a selection of filters on the OpenGL command stream. The debugger allows viewing of state, textures, framebuffers, and shaders, while the filters allow for logging, error checking, video capture, and more.
- gDEBugger - A full-featured, free debugger and profiler for OpenGL and OpenGL ES on Windows and Linux.
- KTX
- Equalizer
- GLee - A cross-platform extension loading library that takes the burden off your application. GLee makes it easy to check for OpenGL extension and core version availability, automatically setting up the entry points with no effort on your part.
- GLEW - An open-source cross-platform extension loading library with thread-safe support for multiple rendering contexts and automatic code generation capability. GLEW provides easy-to-use and efficient methods for checking OpenGL extensions and core functionality.
- libktx
- OpenSceneGraph - A high-level 3D graphics toolkit exposing OpenGL's capabilities while providing many capabilities of its own. OpenSceneGraph has a large user community and has been employed for visual simulation, games, virtual reality, scientific visualization, and modeling.
- Mesa 3D Graphics Library - An open-source implementation of the OpenGL specification, a system for rendering interactive 3D graphics. Mesa ties into several other open-source projects: the [Direct Rendering Infrastructure](https://dri.freedesktop.org/), [X.org](https://x.org/), and [Wayland](https://wayland.freedesktop.org/) to provide OpenGL support on Linux, FreeBSD, and other operating systems.
- GLUS - An open-source C library that provides a hardware and operating system abstraction plus many functions usually needed for graphics programming using OpenGL, OpenGL ES, or OpenVG.
- OpenGL Mathematics (GLM)
Parallel Computing Learning Resources
- Parallel Computing - A type of computation in which many calculations are carried out simultaneously. Its main forms are [bit-level](https://en.wikipedia.org/wiki/Bit-level_parallelism), [instruction-level](https://en.wikipedia.org/wiki/Instruction-level_parallelism), [data](https://en.wikipedia.org/wiki/Data_parallelism), and [task parallelism](https://en.wikipedia.org/wiki/Task_parallelism).
- Accelerated Computing - Training | NVIDIA Developer
- Fundamentals of Accelerated Computing with CUDA Python Course | NVIDIA
- Top Parallel Computing Courses Online | Udemy
- Scientific Computing Masterclass: Parallel and Distributed
- Learn Parallel Computing in Python | Udemy
- GPU computing in Vulkan | Udemy
- High Performance Computing Courses | Udacity
- Parallel Computing Courses | Stanford Online
- Parallel Computing with CUDA | Pluralsight
- HPC Architecture and System Design | Intel
- Top Parallel Computing Courses Online | Coursera
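
The data and task forms of parallelism named in the Parallel Computing entry above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a recommended implementation: the names `square`, `data_parallel`, and `task_parallel` are hypothetical, and threads are used for portability, so the sketch shows the structure of each form rather than a CPU speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    """Hypothetical per-element work item."""
    return x * x


def data_parallel(values, workers=4):
    """Data parallelism: apply the same operation to different elements."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the inputs.
        return list(pool.map(square, values))


def task_parallel(values, workers=2):
    """Task parallelism: run different operations concurrently on the same data."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        total = pool.submit(sum, values)
        largest = pool.submit(max, values)
        return total.result(), largest.result()


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print(data_parallel(data))
    print(task_parallel(data))
```

For CPU-bound work in CPython, swapping in `ProcessPoolExecutor` (same module) is the usual next step; the structure of both functions stays the same.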
Parallel Computing Tools, Libraries, and Frameworks
- Parallel Computing Toolbox™ - Lets you solve computationally and data-intensive problems using multicore processors, GPUs, and computer clusters. High-level constructs such as parallel for-loops, special array types, and parallelized numerical algorithms enable you to parallelize MATLAB® applications without CUDA or MPI programming. The toolbox lets you use parallel-enabled functions in MATLAB and other toolboxes. You can use the toolbox with Simulink® to run multiple simulations of a model in parallel. Programs and models can run in both interactive and batch modes.
- Statistics and Machine Learning Toolbox™
- OpenMP - An API that supports multi-platform shared-memory parallel programming in C/C++ and Fortran. The OpenMP API defines a portable, scalable model with a simple and flexible interface for developing parallel applications on platforms from the desktop to the supercomputer.
- CUDA®
- Message Passing Interface (MPI) - A standardized, portable message-passing standard designed to function on parallel computing architectures.
- Slurm - An open-source workload manager designed for high performance computing.
- AWS ParallelCluster - An AWS-supported open source cluster management tool that makes it easy to deploy and manage High Performance Computing (HPC) clusters on AWS. ParallelCluster uses a simple text file to model and provision all the resources needed for your HPC applications in an automated and secure manner.
- Numba - A NumPy-aware optimizing compiler for Python sponsored by Anaconda, Inc. It uses the LLVM compiler project to generate machine code from Python syntax. Numba can compile a large subset of numerically focused Python, including many NumPy functions. Additionally, Numba supports automatic parallelization of loops, generation of GPU-accelerated code, and creation of ufuncs and C callbacks.
- Chainer - A Python-based deep learning framework aiming at flexibility. It provides automatic differentiation APIs based on the define-by-run approach (dynamic computational graphs) as well as object-oriented high-level APIs to build and train neural networks. It also supports CUDA/cuDNN using [CuPy](https://github.com/cupy/cupy) for high performance training and inference.
- cuML - A suite of GPU-accelerated machine learning libraries whose Python API in most cases matches that of scikit-learn.
- Apache Flume
- XGBoost
- Apache Mesos
- Apache HBase™ - An open-source, NoSQL, distributed big data store. It enables random, strictly consistent, real-time access to petabytes of data. HBase is very effective for handling large, sparse datasets. HBase serves as a direct input and output to the Apache MapReduce framework for Hadoop, and works with Apache Phoenix to enable SQL-like queries over HBase tables.
- Hadoop Distributed File System (HDFS) - A distributed file system and one of the major components of Apache Hadoop, alongside MapReduce and YARN.
- Apache Arrow - A language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware like CPUs and GPUs.
- Apache Spark™ - A unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing.
- Apache PredictionIO
- Microsoft Project Bonsai - A low-code AI platform that speeds AI-powered automation development and is part of the Autonomous Systems suite from Microsoft. Bonsai is used to build AI components that can provide operator guidance or make independent decisions to optimize process variables, improve production efficiency, and reduce downtime.
- Cluster Manager for Apache Kafka (CMAK)
- BigDL
- Apache Beam - An open source, unified model and set of language-specific SDKs for defining and executing data processing workflows, as well as data ingestion and integration flows, supporting Enterprise Integration Patterns (EIPs) and Domain Specific Languages (DSLs).
- Jupyter Notebook - An open-source web application that allows you to create and share documents that contain live code, equations, visualizations, and narrative text. Jupyter is used widely in industries that do data cleaning and transformation, numerical simulation, statistical modeling, data visualization, data science, and machine learning.
- Neo4j - An enterprise-strength graph database that combines native graph storage, advanced security, scalable speed-optimized architecture, and ACID compliance to ensure predictability and integrity of relationship-based queries.
- Elasticsearch - A distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. Elasticsearch is developed in Java.
- Logstash
- Kibana
- Trino - A distributed SQL query engine for big data. It can speed up ETL processes, lets them use standard SQL statements, and works with numerous data sources and targets in the same system.
- Redis (REmote DIctionary Server) - An in-memory data structure store, used as a database, cache, and message broker. It provides data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes, and streams.
- Apache OpenNLP - An open-source, machine learning based toolkit for processing natural language text. It features an API for use cases like [Named Entity Recognition](https://en.wikipedia.org/wiki/Named-entity_recognition), Sentence Detection, [POS (Part-Of-Speech) tagging](https://en.wikipedia.org/wiki/Part-of-speech_tagging), [Tokenization](https://en.wikipedia.org/wiki/Tokenization_(data_security)), [Feature extraction](https://en.wikipedia.org/wiki/Feature_extraction), [Chunking](https://en.wikipedia.org/wiki/Chunking_(psychology)), [Parsing](https://en.wikipedia.org/wiki/Parsing), and [Coreference resolution](https://en.wikipedia.org/wiki/Coreference).
- Open Neural Network Exchange (ONNX) - An open source format for AI models that defines an extensible computation graph model, as well as definitions of built-in operators and standard data types.
- AutoGluon - A toolkit for training and deploying high-accuracy deep learning models on tabular, image, and text data.
- Microsoft MPI (MS-MPI)
Performance Benchmarks
- Geekbench 5 - A cross-platform benchmark that measures your system's performance with the press of a button.
- Phoronix Test Suite
- UNIGINE Superposition
Python Frameworks and Tools
- Python Package Index (PyPI)
- PyCharm
- Python Tools for Visual Studio (PTVS)
- Pylance
- Pyright
- Django - A high-level Python web framework that encourages rapid development and clean, pragmatic design.
- AWS Chalice
- HTTPie
- Pipenv
- Python Fire
- Bottle - A fast, simple, lightweight WSGI micro web-framework for Python. It is distributed as a single file module and has no dependencies other than the [Python Standard Library](https://docs.python.org/library/).
- Falcon - A high-performance Python web framework for building large-scale app backends and microservices with support for MongoDB, Pluggable Applications and autogenerated Admin.
- Neural Network Intelligence (NNI)
- Luigi - A Python module for building complex pipelines of batch jobs, with Hadoop support built in.
- Locust
- spaCy
- Pillow
- IPython
- Pandas
- PuLP
- Matplotlib - A plotting library that produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms.
- Sanic
- GraphLab Create - A Python library for building large-scale, high-performance machine learning models.
- Sentry
- Web2py - An open-source web application framework written in Python that allows web developers to program dynamic web content. One web2py instance can run multiple web sites using different databases.
Python Learning Resources
- CheckiO
- Getting Started with Python in Visual Studio Code
- Google's Python Style Guide
- The Python Open Source Computer Science Degree by Forrest Knight
- Intro to Python for Data Science
- Intro to Python by W3schools
- Codecademy's Python 3 course
- Learn Python with Online Courses and Classes from edX
- Python Courses Online from Coursera
- Real Python
R Learning Resources
- R
- An Introduction to R
- Google's R Style Guide
- Running R at Scale on Google Compute Engine
- Running R on AWS
- Learn R by Codecademy
- Learn R Programming with Online Courses and Lessons by edX
- Learn R For Data Science by Udacity
- R developer's guide to Azure
- RStudio Server Pro for AWS