gpu-guide
Graphics Processing Unit (GPU) Architecture Guide
https://github.com/mikeroyal/gpu-guide
Core ML Learning Resources
Core ML Tools, Libraries, and Frameworks
- Core ML tools
- Create ML
- Apple Vision
- SwiftUI
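- UIKit - A framework that provides the window and view architecture for your interface, the event-handling infrastructure for delivering Multi-Touch and other types of input to your app, and the main run loop needed to manage interactions among the user, the system, and your app.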
- AppKit
- Instruments - A performance-analysis and testing tool that’s part of the Xcode tool set. It’s designed to help you profile your iOS, watchOS, tvOS, and macOS apps, processes, and devices in order to better understand and optimize their behavior and performance.
- CocoaPods - A dependency manager for Swift and Objective-C used in Xcode projects. You specify the dependencies for your project in a simple text file; CocoaPods then recursively resolves dependencies between libraries, fetches source code for all dependencies, and creates and maintains an Xcode workspace to build your project.
- AppCode - An IDE from JetBrains that monitors the quality of your code, warns you of errors and code smells, and suggests quick-fixes to resolve them automatically. AppCode provides many code inspections for Objective-C, Swift, and C/C++, plus a number of inspections for other supported languages.
CUDA Learning Resources
CUDA Tools, Libraries, and Frameworks
- CUDA Toolkit - A development environment for creating high-performance GPU-accelerated applications. The CUDA Toolkit lets you develop, optimize, and deploy your applications on GPU-accelerated embedded systems, desktop workstations, enterprise data centers, cloud-based platforms, and HPC supercomputers. The toolkit includes GPU-accelerated libraries, debugging and optimization tools, a C/C++ compiler, and a runtime library to build and deploy your application on major architectures including x86, Arm, and POWER.
- CUDA-X HPC - A collection of libraries, tools, compilers, and APIs for high-performance computing (HPC). CUDA-X HPC includes highly tuned kernels essential for HPC.
- CUTLASS - A collection of CUDA C++ template abstractions for implementing high-performance matrix-multiplication (GEMM) at all levels and scales within CUDA. It incorporates strategies for hierarchical decomposition and data movement similar to those used to implement cuBLAS.
- CUB
- Tensorman
- CuPy - An implementation of a NumPy-compatible multi-dimensional array on CUDA. CuPy consists of the core multi-dimensional array class, cupy.ndarray, and many functions on it. It supports a subset of the numpy.ndarray interface.
- CatBoost
- cuDF - A GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data. It provides a pandas-like API that will be familiar to data engineers and data scientists, so they can use it to accelerate their workflows without going into the details of CUDA programming.
- ArrayFire - A general-purpose library that simplifies the process of developing software that targets parallel and massively parallel architectures, including CPUs, GPUs, and other hardware acceleration devices.
- Thrust - A C++ parallel programming library that resembles the C++ Standard Library. Its high-level interface greatly enhances programmer productivity while enabling performance portability between GPUs and multicore CPUs.
- AresDB - A GPU-powered real-time analytics storage and query engine. It features low query latency, high data freshness, and highly efficient in-memory and on-disk storage management.
- Arraymancer - A tensor (N-dimensional array) project in Nim. The main focus is providing a fast and ergonomic CPU, CUDA, and OpenCL ndarray library on which to build a scientific computing ecosystem.
- Kintinuous - A real-time dense visual SLAM system capable of producing high-quality, globally consistent point and mesh reconstructions over hundreds of metres in real time with only a low-cost commodity RGB-D sensor.
- Fuzzy logic - A heuristic approach that allows for more advanced decision-tree processing and better integration with rules-based programming.
- Support Vector Machine (SVM) - A supervised machine learning model that uses classification algorithms for two-group classification problems.
- Region-based Convolutional Neural Networks (R-CNN)
- Multilayer Perceptrons (MLPs) - Multi-layer neural networks composed of multiple layers of [perceptrons](https://en.wikipedia.org/wiki/Perceptron) with a threshold activation.
- Decision trees - Tree-structured models for classification and regression.
- Naive Bayes - A classification algorithm based on applying Bayes' theorem with strong independence assumptions between the features.
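The Multilayer Perceptrons entry above describes layers of perceptrons with a threshold activation. The following is a minimal sketch of that idea in plain Python, not code from any library listed here. The function names (`step`, `perceptron`, `xor_mlp`) and the hand-picked weights are illustrative assumptions; a real MLP would learn its weights from data.

```python
def step(x):
    """Threshold activation: output 1 when the input is non-negative, else 0."""
    return 1 if x >= 0 else 0


def perceptron(inputs, weights, bias):
    """A single perceptron: weighted sum of the inputs plus a bias, then threshold."""
    return step(sum(w * i for w, i in zip(weights, inputs)) + bias)


def xor_mlp(a, b):
    """Two-layer network of threshold perceptrons computing XOR.

    The weights are chosen by hand for illustration (hypothetical values).
    """
    # Hidden layer: one unit acts as OR, the other as NAND.
    h_or = perceptron((a, b), (1, 1), -1)
    h_nand = perceptron((a, b), (-1, -1), 1)
    # Output layer: AND of the two hidden units.
    return perceptron((h_or, h_nand), (1, 1), -2)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_mlp(a, b))
```

A single perceptron cannot represent XOR because the classes are not linearly separable; stacking a hidden layer in front of the output unit is what makes it possible.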
Deep Learning Learning Resources
- Top Deep Learning Courses Online | Udemy
- Learn Deep Learning with Online Courses and Lessons | edX
- Deep Learning Online Course Nanodegree | Udacity
- Data Science: Deep Learning and Neural Networks in Python | Udemy
- Understanding Machine Learning with Python | Pluralsight
- How to Think About Machine Learning Algorithms | Pluralsight
- Deep Learning Courses | Stanford Online
- Deep Learning - UW Professional & Continuing Education
- Deep Learning Online Courses | Harvard University
- Artificial Intelligence Expert Course: Platinum Edition | Udemy
- Learn Artificial Intelligence with Online Courses and Lessons | edX
- Artificial Intelligence Nanodegree program
- Artificial Intelligence (AI) Online Courses | Udacity
- Intro to Artificial Intelligence Course | Udacity
- Edge AI for IoT Developers Course | Udacity
- Expert Systems and Applied Artificial Intelligence
- Introduction to Microsoft Project Bonsai
- Autonomous Maritime Systems Training | AMC Search
- Top Autonomous Cars Courses Online | Udemy
- Applied Control Systems 1: autonomous cars: Math + PID + MPC | Udemy
- Learn Autonomous Robotics with Online Courses and Lessons | edX
- Autonomous Systems Online Courses & Programs | Udacity
- Autonomous Systems MOOC and Free Online Courses | MOOC List
- Robotics and Autonomous Systems Graduate Program | Stanford Online
- Mobile Autonomous Systems Laboratory | MIT OpenCourseWare
- Autonomous Systems - Microsoft AI
- Machine Learning for Everyone Courses | DataCamp
- Reasoning: Goal Trees and Rule-Based Expert Systems | MIT OpenCourseWare
- Deep Learning Online Courses | NVIDIA
Deep Learning Tools, Libraries, and Frameworks
- AMD FidelityFX Super Resolution (FSR) - An open-source, high-quality solution for producing high-resolution frames from lower-resolution inputs. It uses a collection of cutting-edge algorithms with a particular emphasis on creating high-quality edges, giving large performance improvements compared to rendering at native resolution directly. FSR enables “practical performance” for costly render operations, such as hardware ray tracing for the AMD RDNA™ and AMD RDNA™ 2 architectures.
- Intel Xe Super Sampling (XeSS) - An AI-based temporal image upscaling technology from Intel. Intel's Arc GPUs have Xe Matrix eXtensions (XMX) engines for hardware-accelerated AI processing. XeSS can also run on devices without XMX, including integrated graphics and non-Intel graphics cards, though performance will be lower there because it is powered by the [DP4a instruction](https://www.intel.com/content/dam/www/public/us/en/documents/reference-guides/11th-gen-quick-reference-guide.pdf).
- Deep Learning Toolbox™ - A MATLAB toolbox that provides a framework for designing and implementing deep neural networks with algorithms, pretrained models, and apps. You can use convolutional neural networks (CNNs) and long short-term memory (LSTM) networks to perform classification and regression on image, time-series, and text data. You can build network architectures such as generative adversarial networks (GANs) and Siamese networks using automatic differentiation, custom training loops, and shared weights. With the Deep Network Designer app, you can design, analyze, and train networks graphically. It can exchange models with TensorFlow™ and PyTorch through the ONNX format and import models from TensorFlow-Keras and Caffe. The toolbox supports transfer learning with DarkNet-53, ResNet-50, NASNet, SqueezeNet, and many other pretrained models.
- Reinforcement Learning Toolbox™ - A MATLAB toolbox for training policies using reinforcement learning algorithms. You can use these policies to implement controllers and decision-making algorithms for complex applications such as resource allocation, robotics, and autonomous systems.
- Deep Learning HDL Toolbox™ - A MATLAB toolbox that provides functions and tools to prototype and implement deep learning networks on FPGAs and SoCs. It provides pre-built bitstreams for running a variety of deep learning networks on supported Xilinx® and Intel® FPGA and SoC devices. Profiling and estimation tools let you customize a deep learning network by exploring design, performance, and resource utilization tradeoffs.
- LIBSVM - Integrated software for support vector classification (C-SVC, nu-SVC), regression (epsilon-SVR, nu-SVR), and distribution estimation (one-class SVM). It supports multi-class classification.
- Microsoft AirSim - A simulator for drones, cars, and more, built on Unreal Engine. AirSim is open-source, cross-platform, and supports [software-in-the-loop simulation](https://www.mathworks.com/help///ecoder/software-in-the-loop-sil-simulation.html) with popular flight controllers such as PX4 and ArduPilot, and [hardware-in-loop](https://www.ni.com/en-us/innovations/white-papers/17/what-is-hardware-in-the-loop-.html) with PX4 for physically and visually realistic simulations. It is developed as an Unreal plugin that can simply be dropped into any Unreal environment. AirSim is being developed as a platform for AI research to experiment with deep learning, computer vision, and reinforcement learning algorithms for autonomous vehicles.
- CARLA - An open-source simulator for autonomous driving research. CARLA has been developed from the ground up to support development, training, and validation of autonomous driving systems. In addition to open-source code and protocols, CARLA provides open digital assets (urban layouts, buildings, vehicles) that were created for this purpose and can be used freely.
- ROS/ROS2 bridge for CARLA (package) - A bridge that enables two-way communication between ROS and CARLA. Information from the CARLA server is translated to ROS topics. In the same way, messages sent between nodes in ROS are translated to commands to be applied in CARLA.
- ROS Toolbox
- Robotics Toolbox™ - A toolbox that brings robotics-specific functionality to MATLAB, including a Simulink model of a non-holonomic vehicle. The Toolbox also includes a detailed Simulink model for a quadrotor flying robot.
- Image Processing Toolbox™ - A MATLAB toolbox that provides a comprehensive set of reference-standard algorithms and workflow apps for image processing, analysis, visualization, and algorithm development. You can perform image segmentation, image enhancement, noise reduction, geometric transformations, image registration, and 3D image processing.
- Computer Vision Toolbox™
- Model Predictive Control Toolbox™ - A MATLAB toolbox for designing and simulating controllers using linear and nonlinear model predictive control (MPC). By running closed-loop simulations, you can evaluate controller performance.
- Predictive Maintenance Toolbox™ - A MATLAB toolbox for managing sensor data, designing condition indicators, and estimating the remaining useful life (RUL) of a machine. It lets you explore, extract, and rank features using data-based and model-based techniques, including statistical, spectral, and time-series analysis.
- Vision HDL Toolbox™ - A MATLAB toolbox that provides pixel-streaming algorithms for the design and implementation of vision systems on FPGAs and ASICs. It provides a design framework that supports a diverse set of interface types, frame sizes, and frame rates. The image processing, video, and computer vision algorithms in the toolbox use an architecture appropriate for HDL implementations.
- Automated Driving Toolbox™ - A MATLAB toolbox that provides algorithms and tools for designing, simulating, and testing ADAS and autonomous driving systems. Visualization tools include a bird's-eye-view plot and scope for sensor coverage, detections and tracks, and displays for video, lidar, and maps. The toolbox lets you import and work with HERE HD Live Map data and OpenDRIVE® road networks. It also provides reference application examples for common ADAS and automated driving features, including FCW, AEB, ACC, LKA, and parking valet. The toolbox supports C/C++ code generation for rapid prototyping and HIL testing, with support for sensor fusion, tracking, path planning, and vehicle controller algorithms.
- UAV Toolbox
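- Navigation Toolbox™ - A MATLAB toolbox that provides algorithms and analysis tools for motion planning, simultaneous localization and mapping (SLAM), and inertial navigation. The toolbox includes customizable search and sampling-based path planners, as well as metrics for validating and comparing paths. You can create 2D and 3D map representations, generate maps using SLAM algorithms, and interactively visualize and debug map generation with the SLAM map builder app.
- Lidar Toolbox™ - A MATLAB toolbox that provides algorithms, functions, and apps for designing, analyzing, and testing lidar processing systems. It supports lidar-camera cross calibration for workflows that combine computer vision and lidar processing.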
- Mapping Toolbox™
DirectX Learning Resources
- Microsoft DirectX® - A low-level API that handles tasks related to multimedia, especially game programming and video, on Microsoft platforms (Windows and Xbox).
- Getting Started with DirectX 12 Ultimate
- Getting Started with the DirectX 12 Agility SDK
- DirectX— Feature Level 12_2
- DirectX 12 Technology | NVIDIA
- AMD DirectX® 12 (DX12) Technology | AMD
- Top Microsoft DirectX Courses Online | Udemy
- DirectX - Learn Microsoft DirectX from Scratch Course | Udemy
- DirectX 11 Programming Course | Udemy
- DirectX 12 and Graphics Education | YouTube
DirectX Tools, Libraries, and Frameworks
- Visual Studio - An integrated development environment (IDE) from Microsoft: a feature-rich application that can be used for many aspects of software development. Visual Studio makes it easy to edit, debug, build, and publish your app using Microsoft software development platforms such as the Windows API, Windows Forms, Windows Presentation Foundation, and Windows Store.
- Visual Studio Code
- DirectX-Graphics-Samples
- PIX on Windows
- DirectStorage API - An API in the DirectX family that brings the storage technology of the Xbox Velocity Architecture to Windows. The API is architected to maximize performance throughout the entire pipeline, from the NVMe drive all the way to the GPU. It does this in several ways: by reducing per-request NVMe overhead, by enabling batched many-at-a-time parallel IO requests that can be efficiently fed to the GPU, and by giving games finer-grained control over when they are notified of IO request completion instead of having to react to every tiny IO completion. The DirectStorage API will be available on [Windows 11](https://www.microsoft.com/en-us/windows/windows-11) PCs with NVMe SSDs, and will also be supported on [Windows 10](https://www.microsoft.com/software-download/windows10) version 1909 and newer.
- FNA
- Simple DirectMedia Layer - A cross-platform development library designed to provide low-level access to audio, keyboard, mouse, joystick, and graphics hardware via OpenGL and Direct3D. It is used by video playback software, emulators, and popular games, including Valve's award-winning catalog.
- NVRHI (NVIDIA Rendering Hardware Interface)
- RTXMU - RTX Memory Utility SDK
Game Development Learning Resources
- Unreal Online Learning - A free learning platform that offers hands-on video courses and guided learning paths.
- Unreal Engine Authorized Training Program
- Unreal Engine for education
- Unreal Engine Training & Simulation
- Unity Certifications
- Game Design Online Courses from Udemy
- Game Design Online Courses from Skillshare
- Learn Game Design with Online Courses and Classes from edX
Game Development Tools, Libraries, and Frameworks
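- Panda3D - A game engine and framework for 3D rendering and game development for Python and C++ programs. Panda3D is open-source and free for any purpose, including commercial ventures.
- Source 2 - A 3D video game engine developed by Valve as the successor to Source. It is used in games such as Dota 2 and Half-Life: Alyx.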
- AutoDesk 3ds Max
- Houdini
- Open Graphics Library (OpenGL) - A cross-language, cross-platform API for hardware-accelerated rendering of 2D/3D vector graphics, currently developed by the [Khronos Group](https://www.khronos.org/).
- High Level Shading Language (HLSL) - A language for writing C-like programmable shaders for the Direct3D pipeline. HLSL was first created with DirectX 9 to set up the programmable 3D pipeline.
- Vulkan - A modern cross-platform graphics and compute API that provides high-efficiency access to modern GPUs used in a wide variety of devices, from PCs and consoles to mobile phones and embedded platforms. Vulkan is developed by the Khronos Group.
- Metal - A low-level GPU programming framework used for rendering 2D and 3D graphics on Apple platforms such as iOS, iPadOS, macOS, watchOS, and tvOS.
- LibGDX - A cross-platform Java game development framework based on OpenGL (ES) that works on Windows, Linux, Mac OS X, Android, WebGL-enabled browsers, and iOS.
- cocos2d-x - A cross-platform framework for building 2D games, interactive books, demos, and other graphical applications. It is based on cocos2d-iphone, but uses C++ instead of Objective-C. It works on iOS, Android, macOS, Windows, and Linux.
- MonoGame - A framework for creating cross-platform games. It is the spiritual successor to XNA, with thousands of titles shipped across desktop, mobile, and console platforms. MonoGame is a fully managed .NET open source game framework without any black boxes.
- HGIG
- Vivox
- MoltenVK
- OpenGL Shading Language (GLSL) - A C-style language, so it covers most of the features a user would expect from such a language. Control structures such as for-loops, if-else statements, and the switch statement exist in GLSL.
- Superpowers - A downloadable HTML5 app for real-time collaborative projects. You can use it solo like a regular offline game maker, or set up a password and let friends join in on your project through their Web browser.
Game Emulators
Game Engines
- Unity - A cross-platform game development platform. Use Unity to build high-quality 3D and 2D games, deploy them across mobile, desktop, VR/AR, consoles, or the Web, and connect with loyal and enthusiastic players and customers.
- Unreal Engine 4 - An advanced real-time 3D creation tool. It has continuously evolved beyond its original purpose as a state-of-the-art game engine, and today gives creators across industries the freedom and control to deliver cutting-edge content, interactive experiences, and immersive virtual worlds.
- Godot Engine - A feature-packed, cross-platform game engine for creating 2D and 3D games from a unified interface. It provides a comprehensive set of common tools, so that users can focus on making games without having to reinvent the wheel. Games can be exported in one click to a number of platforms, including the major desktop platforms (Linux, Mac OSX, Windows) as well as mobile (Android, iOS) and web-based (HTML5) platforms.
- If you would like to Donate to the Godot Project
- Blender
- If you would like to Donate to the Blender Project
- Unigine - A cross-platform game engine designed for development teams (C++/C# programmers, 3D artists) working on interactive 3D apps.
Game Streaming
- Geforce NOW
- Moonlight Game Streaming
- Chiaki
- Xbox Project xCloud - Microsoft's cloud-based Xbox game-streaming technology **(currently in Beta)**. **Play games like Forza Horizon 4, Halo 5: Guardians, Gears of War 4, Sea of Thieves, Cuphead, Red Dead Redemption 2, and 100+ other games on your mobile device or Chrome web browser**. Project xCloud requires an [Xbox Game Pass Ultimate](https://www.xbox.com/en-US/xbox-game-pass/cloud-gaming) subscription.
- Amazon Luna
Julia Learning Resources
Julia Tools, Libraries and Frameworks
- JuliaPro
- Debugger.jl
- Profile (Stdlib)
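- Revise.jl - A package that lets you modify code and use the changes without restarting Julia, saving the overhead of restarting, reloading packages, and waiting for code to JIT-compile.
- JuliaGPU - A GitHub organization created to unify the many packages for programming GPUs in Julia. With its high-level syntax and flexible compiler, Julia is well positioned to productively program hardware accelerators like GPUs without sacrificing performance.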
- IJulia.jl
- AWS.jl
- CUDA.jl - The main programming interface for working with NVIDIA CUDA GPUs in Julia. It features a user-friendly array abstraction, a compiler for writing CUDA kernels in Julia, and wrappers for various CUDA libraries.
- Nanosoldier.jl
- JuMP.jl - A domain-specific modeling language for [mathematical optimization](https://en.wikipedia.org/wiki/Mathematical_optimization) embedded in Julia.
- Optim.jl
- RCall.jl
- PyCall.jl
- MXNet.jl - The Apache MXNet Julia package, which brings flexible and efficient GPU computing and state-of-the-art deep learning to Julia.
- Distributions.jl
- Flux.jl - A machine learning library that is a 100% pure-Julia stack and provides lightweight abstractions on top of Julia's native GPU and AD support.
- IRTools.jl
- Juno
- XLA.jl
- Julia for VSCode
- JavaCall.jl
- Knet
- DataFrames.jl
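- Cassette.jl - A Julia package that provides a mechanism for dynamically injecting code transformation passes into Julia's just-in-time (JIT) compilation cycle, enabling post hoc analysis and modification of "Cassette-unaware" Julia programs without requiring manual source annotation or refactoring of the target code.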
Learning Resources for ML
- Machine Learning by Stanford University from Coursera
- Machine Learning Scholarship Program for Microsoft Azure from Udacity
- Scheduling Jupyter notebooks on Amazon SageMaker ephemeral instances
- Machine Learning Courses Online from Udemy
- Learn Machine Learning with Online Courses and Classes from edX
- Learning Machine learning and artificial intelligence from Google Cloud Training
- AWS Training and Certification for Machine Learning (ML) Courses
MATLAB Learning Resources