Projects in Awesome Lists tagged with rocm

https://github.com/vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

amd cuda gpt inference inferentia llama llm llm-serving llmops mlops model-serving pytorch rocm tpu trainium transformer xpu

Last synced: 29 Sep 2024

https://github.com/apache/tvm

Open deep learning compiler stack for CPU, GPU and specialized accelerators

compiler deep-learning gpu javascript machine-learning metal opencl performance rocm spirv tensor tvm vulkan

Last synced: 29 Sep 2024

https://github.com/cupy/cupy

NumPy & SciPy for GPU

cublas cuda cudnn cupy curand cusolver cusparse cusparselt cutensor gpu nccl numpy nvrtc nvtx python rocm scipy tensor

Last synced: 29 Sep 2024

https://github.com/dmlc/nnvm

computation-graph cuda deep-learning deployment metal nnvm opencl optimization rocm tvm

Last synced: 02 Aug 2024

https://github.com/deepmodeling/deepmd-kit

A deep learning package for many-body potential energy representation and molecular dynamics

ase c computational-chemistry cpp cuda deep-learning deepmd ipi lammps materials-science molecular-dynamics nodejs potential-energy python pytorch rocm tensorflow

Last synced: 30 Sep 2024

https://github.com/stotko/stdgpu

stdgpu: Efficient STL-like Data Structures on the GPU

cpp cpp17 cpp20 cuda data-structures gpgpu gpu gpu-acceleration gpu-computing hip modern-cpp openmp rocm stl stl-containers stl-like

Last synced: 30 Sep 2024

https://github.com/ROCm/ROCm-docker

Dockerfiles for the various software layers defined in the ROCm software platform

docker rocm

Last synced: 01 Aug 2024

https://github.com/agenium-scale/nsimd

Agenium Scale vectorization library for CPUs and GPUs

aarch64 avx avx2 avx512 cpp20 cpp20-library cuda hpc neon neon128 rocm simd simd-instructions simd-library simd-programming sse2 sse42 sve vectorization-library

Last synced: 29 Sep 2024

https://github.com/alpaka-group/alpaka

Abstraction Library for Parallel Kernel Acceleration :llama:

cpp cpp17 cuda gpu header-only heterogeneous-parallel-programming hip hpc openacc openmp rocm tbb

Last synced: 29 Sep 2024

https://github.com/JuliaGPU/AMDGPU.jl

AMD GPU (ROCm) programming in Julia

amdgpu julia rocm

Last synced: 04 Aug 2024

https://github.com/ROCm/MIVisionX

MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into a single toolkit. AMD MIVisionX also delivers a highly optimized open-source implementation of the Khronos OpenVX™ and OpenVX™ Extensions.

amd-opencl amd-opencv amd-openvx computer-vision inference inference-engine khronos-openvx machine-learning neural-network nnef onnx opencl openvx openvx-extensions openvx-neural-network rocm ryzen virtual-reality windows-machine-learning winml

Last synced: 06 Aug 2024

https://github.com/electronic-structure/SIRIUS

Domain specific library for electronic structure calculations

cuda density-functional-theory electronic-structure-calculations full-potential gpu lapw mpi planewave pseudopotential rocm

Last synced: 04 Aug 2024

https://github.com/Grench6/RX580-rocM-tensorflow-ubuntu20.4-guide

Installation guide for ROCm and TensorFlow on Ubuntu for the RX580

rocm tensorflow-rocm

Last synced: 07 Aug 2024

https://github.com/sukhmeetbawa/opencl-amd-fedora

AMD OpenCL userspace drivers for Fedora. Currently not working on Fedora 37.

amd fedora-workstation linux opencl rocm

Last synced: 29 Sep 2024

https://github.com/GPUOpen-Tools/radeon_compute_profiler

The Radeon Compute Profiler (RCP) is a performance analysis tool that gathers data from the API run-time and GPU for OpenCL™ and ROCm/HSA applications. This information can be used by developers to discover bottlenecks in the application and to find ways to optimize the application's performance.

opencl profiler rocm

Last synced: 03 Aug 2024

https://github.com/PennyLaneAI/pennylane-lightning

The PennyLane-Lightning plugin provides a fast state-vector simulator written in C++ for use with PennyLane

cuda distributed-computing gpu hpc mpi openmp parallel quantum-computing quantum-machine-learning rocm

Last synced: 03 Aug 2024

https://github.com/ROCm/rpp

AMD ROCm Performance Primitives (RPP) library is a comprehensive high-performance computer vision library for AMD processors with HIP/OpenCL/CPU back-ends.

agumentation amd bitwise channel-extract computer-vision contrast cpu gpu hip histogram hpc mivisionx opencl openvx radeon-performance-primitives rocm rpp warp-affine

Last synced: 30 Jul 2024

https://github.com/GPUOpen-ProfessionalCompute-Libraries/rpp

AMD ROCm Performance Primitives (RPP) library is a comprehensive high-performance computer vision library for AMD processors with HIP/OpenCL/CPU back-ends.

agumentation amd bitwise channel-extract computer-vision contrast cpu gpu hip histogram hpc mivisionx opencl openvx radeon-performance-primitives rocm rpp warp-affine