https://github.com/eliranwong/amd_igpu_ai_setup

AMD iGPU AI Setup and Speed Test - GPD Pocket 4 - Linux + ROCm + AgentMake AI

# AMD_iGPU_AI_Setup

AMD-iGPU-device setup for AI development. We also record some speed test results.

For a setup with multiple AMD GPUs, please visit https://github.com/eliranwong/MultiAMDGPU_AIDev_Ubuntu

# Tested Device

Device: [GPD Pocket 4](https://gpd.hk/gpdpocket4)

Hardware: AMD Ryzen™ AI 9 HX 370; AMD Radeon™ 890M (RDNA 3.5)

## Memory Setting

BIOS Memory Setting (reboot+DEL key):

UEFI/BIOS -> Advanced -> AMD CBS -> NBIO -> GFX Configuration >

Default settings:

```
iGPU Advanced Control > Disabled
Dedicated Graphics Memory > Medium (16GB)
Remaining System Memory > 48GB
```

Settings for the best performance:

For a performance comparison, please refer to our [speed test results](https://github.com/eliranwong/AMD_iGPU_AI_Setup/blob/main/README.md#speed-tests)

```
iGPU Advanced Control > Enabled
Dedicated Graphics Memory > 32GB
Remaining System Memory > 32GB
```
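
After rebooting into Linux, one way to confirm that the BIOS memory setting took effect is to read the VRAM size that the `amdgpu` kernel driver reports. The sketch below assumes the driver exposes `mem_info_vram_total` under `/sys/class/drm/`; the function names are illustrative only and not part of this project.

```shell
# Convert a byte count (as reported by amdgpu sysfs) to whole GiB.
bytes_to_gib() {
  echo $(( $1 / 1024 / 1024 / 1024 ))
}

# Print the dedicated VRAM size of each amdgpu device.
# Assumption: the driver exposes mem_info_vram_total for the iGPU.
show_vram() {
  for f in /sys/class/drm/card*/device/mem_info_vram_total; do
    [ -r "$f" ] || continue
    echo "$f: $(bytes_to_gib "$(cat "$f")") GiB"
  done
}
```

Run `show_vram` in a terminal; with the 32GB setting above, the reported value should be close to 32 GiB.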

## Operating System

Operating System: [Ubuntu 24.04.1 LTS](https://ubuntu.com/tutorials/install-ubuntu-desktop#1-overview) (reboot+F7 to install)

> uname -a

```
Linux ai 6.14.0-37-generic #37~24.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 20 10:25:38 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
```

# Install Basic Tools

> sudo apt update && sudo apt full-upgrade

> sudo apt install "linux-headers-$(uname -r)" "linux-modules-extra-$(uname -r)"

> sudo apt -y install software-properties-common dirmngr apt-transport-https lsb-release ca-certificates apt-utils build-essential make cmake tree wget curl git zip unzip xz-utils nano micro w3m lynx sqlite3 libsqlite3-dev sqlitebrowser libnss3 libnss3-dev libgl1-mesa-dev mesa-utils libglu1-mesa lsb-release binutils ffmpeg gawk opencc plocate gnome-keyring libssl-dev libffi-dev libpci3 libpci-dev python3 python3-setuptools python3-pip python3-dev python3-venv zlib1g-dev libgdbm-dev libreadline-dev libbz2-dev gcc xorg-dev exo-utils dex xdg-utils libavcodec-extra libportaudio2 moreutils llvm tk-dev liblzma-dev python3-openssl libxml2-dev libxmlsec1-dev protobuf-compiler libc6-dev libstdc++-12-dev libxcb-cursor-dev libxcb-cursor0 libncurses-dev libncurses6 ubuntu-restricted-addons ubuntu-restricted-extras gstreamer1.0-plugins-base gstreamer1.0-plugins-good gstreamer1.0-plugins-bad gstreamer1.0-plugins-ugly xsel portaudio19-dev vlc python3-wheel python3-wheel-whl twine libomp-dev gnome-shell-extension-manager

## Touchscreen Tools

Launch `Extension Manager` and install:

* Screen Rotate

* GJS OSK

# Install ROCm

Find details at: https://rocm.docs.amd.com/projects/radeon-ryzen/en/latest/docs/install/installryz/native_linux/install-ryzen.html

Tested ROCm version: 7.1.1

```
sudo usermod -a -G render,video $LOGNAME
sudo apt update
sudo apt install -y libstdc++-12-dev
wget https://repo.radeon.com/amdgpu-install/7.1.1/ubuntu/noble/amdgpu-install_7.1.1.70101-1_all.deb
sudo apt install ./amdgpu-install_7.1.1.70101-1_all.deb
sudo amdgpu-install --usecase=graphics,multimedia,rocm,rocmdev,rocmdevtools,lrt,opencl,openclsdk,hip,hiplibsdk,openmpsdk,mllib,mlsdk --no-dkms -y
sudo reboot
```

Read more at: https://github.com/eliranwong/MultiAMDGPU_AIDev_Ubuntu/blob/main/README.md

# Check gfx driver version

> rocminfo

Expected output:

```
...
Name: gfx1150
...
```
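
Since `rocminfo` prints a lot of text, a small filter can help to pull out only the gfx agent names. This is a minimal sketch that assumes the agent name appears on a line of the form `Name: gfx1150`, as in the expected output above; `list_gfx_agents` is a hypothetical helper name.

```shell
# Read rocminfo-style text on stdin and print each gfx agent name once.
# Only lines whose Name field is exactly a gfx target are matched.
list_gfx_agents() {
  sed -n 's/^ *Name: *\(gfx[0-9a-f][0-9a-f]*\) *$/\1/p' | sort -u
}
```

Usage: `rocminfo | list_gfx_agents`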

# Environment variables

Add the following lines to `~/.bashrc`:

```
export ROCM_HOME=/opt/rocm-7.1.1
export LD_LIBRARY_PATH=/opt/rocm-7.1.1/include:/opt/rocm-7.1.1/lib:$LD_LIBRARY_PATH
export PATH=$HOME/.local/bin:/opt/rocm-7.1.1/bin:/opt/rocm-7.1.1/llvm/bin:$PATH
```

```
# rocm
export GFX_ARCH=gfx1150
export HCC_AMDGPU_TARGET=gfx1150
export CUPY_INSTALL_USE_HIP=1
export ROCM_VERSION=7.1
export ROCM_HOME=/opt/rocm
export LD_LIBRARY_PATH=/usr/include/vulkan:/opt/rocm/include:/opt/rocm/lib:/opt/rocm/lib/llvm/lib:/opt/rocm/lib/migraphx/lib:$LD_LIBRARY_PATH
export PATH=$HOME/.local/bin:/opt/rocm/bin:/opt/rocm/llvm/bin:$PATH
export HSA_OVERRIDE_GFX_VERSION=11.5.0
#export ROCR_VISIBLE_DEVICES=GPU-XX
export GPU_DEVICE_ORDINAL=0
export HIP_VISIBLE_DEVICES=0
export CUDA_VISIBLE_DEVICES=0
export LLAMA_HIPLAS=0
export DRI_PRIME=0
export OMP_DEFAULT_DEVICE=0
# vulkan
export GGML_VULKAN_DEVICE=0
export GGML_VK_VISIBLE_DEVICES=0
export VULKAN_SDK=/usr/share/vulkan
export VK_LAYER_PATH=$VULKAN_SDK/explicit_layer.d
```
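
After running `source ~/.bashrc`, it may be worth checking that the key variables are actually set in the current shell. The helper below is a sketch only; `check_rocm_env` is a hypothetical name and the list of variables it checks is an arbitrary selection from the block above.

```shell
# Report which of the selected variables are unset or empty.
# Returns 0 when all are set, 1 otherwise.
check_rocm_env() {
  missing=0
  for name in ROCM_HOME HSA_OVERRIDE_GFX_VERSION HIP_VISIBLE_DEVICES; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "unset: $name"
      missing=1
    fi
  done
  return $missing
}
```

Usage: `check_rocm_env && echo "ROCm environment looks complete"`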

## Remarks about ROCR_VISIBLE_DEVICES

Do NOT set ROCR_VISIBLE_DEVICES for iGPU.

Set it ONLY for discrete GPUs, e.g. https://github.com/eliranwong/MultiAMDGPU_AIDev_Ubuntu#overview

## Remarks about HSA_OVERRIDE_GFX_VERSION:

1. Check `rocminfo` output first

2. General workaround if the gfx version is not available in the `rocminfo` output:

   * for GCN 5th gen based GPUs and APUs: `HSA_OVERRIDE_GFX_VERSION=9.0.0`

   * for RDNA 1 based GPUs and APUs: `HSA_OVERRIDE_GFX_VERSION=10.1.0`

   * for RDNA 2 based GPUs and APUs: `HSA_OVERRIDE_GFX_VERSION=10.3.0`

   * for RDNA 3 based GPUs and APUs: `HSA_OVERRIDE_GFX_VERSION=11.0.0`

   * for RDNA 3.5 based GPUs and APUs: `HSA_OVERRIDE_GFX_VERSION=11.5.0`

3. Read more at: https://llvm.org/docs/AMDGPUUsage.html#processors
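
The override values listed above follow the digits of the gfx target name: `gfx1150` corresponds to `11.5.0`, `gfx1030` to `10.3.0`, and so on. As a rough sketch of that pattern, the hypothetical helper below splits a target name into major, minor, and stepping parts. It assumes an all-decimal target name and deliberately rejects names with letter suffixes such as `gfx90a`.

```shell
# Derive an HSA_OVERRIDE_GFX_VERSION-style string from a gfx target name.
# The last two digits are taken as minor and stepping, the rest as major.
gfx_to_override() {
  name=${1#gfx}
  case $name in
    ''|?|??|*[!0-9]*)
      echo "unsupported target: $1" >&2
      return 1
      ;;
  esac
  step=${name#"${name%?}"}    # last digit
  rest=${name%?}              # everything before the last digit
  minor=${rest#"${rest%?}"}   # second-to-last digit
  major=${rest%?}             # remaining leading digits
  echo "$major.$minor.$step"
}
```

Usage: `gfx_to_override gfx1150` prints `11.5.0`.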

In my case, I am running ROCm version 7.1.1.

When I run:

> ls /opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx*.dat

I got the following output:

```
/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1030.dat
/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1100.dat
/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1101.dat
/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1102.dat
/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1150.dat
/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1151.dat
/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1200.dat
/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1201.dat
/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx908.dat
/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx90a.dat
/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx942.dat
/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx950.dat
```

This confirms that ROCm version 7.1.1 does support `gfx1150`.
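
The same check can be scripted. The sketch below reads a file listing on stdin and extracts the gfx targets from the `TensileLibrary_lazy_*.dat` file names shown above; both function names are hypothetical.

```shell
# Read file names on stdin and print the gfx targets found in
# TensileLibrary_lazy_<target>.dat names, one per line.
list_rocblas_targets() {
  sed -n 's/.*TensileLibrary_lazy_\(gfx[0-9a-f]*\)\.dat$/\1/p' | sort -u
}

# Succeed if the target given as $1 appears in the listing on stdin.
rocblas_supports() {
  list_rocblas_targets | grep -qx "$1"
}
```

Usage: `ls /opt/rocm/lib/rocblas/library/ | rocblas_supports gfx1150 && echo supported`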

# Install Docker Engine

```
# uninstall any old versions
for pkg in docker.io docker-doc docker-compose docker-compose-v2 podman-docker containerd runc; do sudo apt-get remove $pkg; done

# Add Docker's official GPG key:
sudo apt update
sudo apt install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
sudo tee /etc/apt/sources.list.d/docker.sources <<EOF
...
```

> open http://localhost:3000

To open SearXNG, run:

> open http://localhost:4000

# Install Ollama

Standard installation: https://ollama.com/download

Configure Ollama, run:

> sudo nano /etc/systemd/system/ollama.service

Add the following three lines at the end of the `[Service]` section:

```
Environment="OLLAMA_NUM_PARALLEL=2"
Environment="OLLAMA_MAX_LOADED_MODELS=2"
Environment="OLLAMA_HOST=0.0.0.0"
```

Reload Ollama, run:

> sudo systemctl daemon-reload

> sudo systemctl restart ollama

Add your user to the group `ollama` to access the Ollama directory:

> sudo usermod -a -G ollama $LOGNAME

> sudo reboot
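
As an alternative to editing `ollama.service` directly, the same three settings can be placed in a systemd drop-in file, which keeps them separate from the unit file itself. The function below is a sketch under that assumption; `write_ollama_dropin` is a hypothetical helper and the directory is passed in so that it can be tried on a scratch path first.

```shell
# Write the three Environment lines to a systemd drop-in file.
# $1 = drop-in directory; for the real service this would be
# /etc/systemd/system/ollama.service.d (writing there requires root).
write_ollama_dropin() {
  mkdir -p "$1" || return 1
  cat > "$1/override.conf" <<'EOF'
[Service]
Environment="OLLAMA_NUM_PARALLEL=2"
Environment="OLLAMA_MAX_LOADED_MODELS=2"
Environment="OLLAMA_HOST=0.0.0.0"
EOF
}
```

After writing the file to the real drop-in directory from a root shell, reload and restart Ollama with the two `systemctl` commands shown above.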

# Build llama.cpp that runs ROCm backend

Run in terminal:

```
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" cmake -S . -B build -DGGML_HIP=ON -DGGML_HIP_UMA=ON -DAMDGPU_TARGETS=gfx1150 -DCMAKE_BUILD_TYPE=Release && cmake --build build --config Release -- -j $(lscpu | grep -m 1 '^Core(s)' | awk '{print $NF}')
```

Expected lines in the terminal output:

```
...
-- Adding CPU backend variant ggml-cpu: -march=native
-- The HIP compiler identification is Clang 18.0.0
-- Detecting HIP compiler ABI info
-- Detecting HIP compiler ABI info - done
-- Check for working HIP compiler: /opt/rocm-7.1.1/lib/llvm/bin/clang - skipped
-- Detecting HIP compile features
-- Detecting HIP compile features - done
-- HIP and hipBLAS found
-- Including HIP backend
...
```
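
The `-j $(lscpu | grep -m 1 '^Core(s)' | awk '{print $NF}')` part of the build command sets the number of parallel build jobs to the "Core(s) per socket" value reported by `lscpu`. The sketch below isolates that step so it can be inspected on its own; `cores_per_socket` is a hypothetical name.

```shell
# Read lscpu output on stdin and print the "Core(s) per socket" value,
# i.e. the last field of the first line starting with "Core(s)".
cores_per_socket() {
  grep -m 1 '^Core(s)' | awk '{print $NF}'
}
```

Usage: `lscpu | cores_per_socket`. Note that this counts physical cores per socket; `nproc` would report logical CPUs instead, which may be a higher number.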

# Build llama.cpp that runs Vulkan backend

As an alternative to the ROCm backend, you may build a copy of llama.cpp that runs the Vulkan backend. On our tested device with an iGPU, the Vulkan backend generates tokens slightly faster than the ROCm backend, while the ROCm backend processes prompts faster. For details, please refer to [speed test results](https://github.com/eliranwong/AMD_iGPU_AI_Setup/blob/main/README.md#speed-tests).

To set up Vulkan driver:

```
sudo apt install glslc glslang-tools glslang-dev mesa-vulkan-drivers vulkan-tools libvulkan-dev libvulkan-memory-allocator-dev libvulkan-volk-dev vulkan-validationlayers vulkan-utility-libraries-dev
```

Install libcurl development headers (if on Ubuntu/Debian):

```
sudo apt-get install libcurl4-openssl-dev
```

To build, run:

```
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -S . -B build -DLLAMA_CURL=ON -DGGML_VULKAN=ON -DCMAKE_BUILD_TYPE=Release && cmake --build build --config Release -- -j $(lscpu | grep -m 1 '^Core(s)' | awk '{print $NF}')
```

Expected lines in the terminal output:

```
...
-- Adding CPU backend variant ggml-cpu: -march=native
-- Found Vulkan: /usr/lib/x86_64-linux-gnu/libvulkan.so (found version "1.3.275") found components: glslc glslangValidator
-- Vulkan found
-- GL_KHR_cooperative_matrix supported by glslc
-- GL_NV_cooperative_matrix2 not supported by glslc
-- Including Vulkan backend
...
```

Make sure you set the vulkan-related variables, e.g. https://github.com/eliranwong/AMD_iGPU_AI_Setup#environment-variables

## Alias for launching llama-server

Run in terminal:

```
cd llama.cpp
echo "alias llamacpp=\"cd /home/$USER/agentmake/models/gguf/ && $(pwd)/build/bin/llama-server --threads $(lscpu | grep -m 1 '^Core(s)' | awk '{print $NF}') -ngl 99 --model\"" >> $HOME/.bashrc
```

Remarks: We add `-ngl 99` to offload as many layers as possible to the GPU. Using the `vulkan` backend, we managed to run `70b` models on the tested device with `-ngl 99` specified. Depending on your hardware, you may need to reduce the `-ngl` value to load large models.
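
If you prefer a shell function over an alias, for example to lower `-ngl` for one model without editing `~/.bashrc`, something like the following could be used instead. This is a sketch only: the variable names `LLAMA_SERVER_BIN`, `LLAMA_MODEL_DIR`, `LLAMA_THREADS` and `LLAMA_NGL` and both function names are made up for illustration, and the default paths are assumptions that you would need to adjust.

```shell
# Hypothetical defaults; point these at your own build and model folder.
: "${LLAMA_SERVER_BIN:=$HOME/llama.cpp/build/bin/llama-server}"
: "${LLAMA_MODEL_DIR:=$HOME/agentmake/models/gguf}"

# Print the llama-server command line for a model file (dry run).
# Threads default to the same lscpu-based core count used by the alias.
llamacpp_cmd() {
  threads=${LLAMA_THREADS:-$(lscpu | grep -m 1 '^Core(s)' | awk '{print $NF}')}
  echo "$LLAMA_SERVER_BIN --threads $threads -ngl ${LLAMA_NGL:-99} --model $LLAMA_MODEL_DIR/$1"
}

# Launch llama-server for real; extra arguments are passed through.
llamacpp_run() {
  model=$1
  shift
  threads=${LLAMA_THREADS:-$(lscpu | grep -m 1 '^Core(s)' | awk '{print $NF}')}
  "$LLAMA_SERVER_BIN" --threads "$threads" -ngl "${LLAMA_NGL:-99}" \
    --model "$LLAMA_MODEL_DIR/$model" "$@"
}
```

Usage: `llamacpp_cmd deepseek-r1.gguf` prints the command that would be run, and `LLAMA_NGL=40 llamacpp_run deepseek-r1.gguf` starts the server with fewer offloaded layers.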

# Install Open-notebook

```
# Create project directory
mkdir -p ~/dev/open-notebook && cd ~/dev/open-notebook

# Download configuration files
curl -O https://raw.githubusercontent.com/lfnovo/open-notebook/main/docker-compose.full.yml
curl -O https://raw.githubusercontent.com/lfnovo/open-notebook/main/.env.example

# Rename and configure environment
mv .env.example docker.env
```

```
# Edit docker.env with your API keys
nano docker.env

# Rename and edit the docker-compose file
mv docker-compose.full.yml docker-compose.yml
# In my case the port `8000` is already used by another service, so
sed -i 's/8000:8000/9000:8000/g' docker-compose.yml

# Start Open Notebook
docker compose up -d

# Make save data accessible
sudo chmod a+w notebook_data/ surreal_data/

# Access open-notebook
open http://localhost:8502
```
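
The port mapping in a compose file has the form `host:container`, so the `sed` substitution above changes only the host side from `8000` to `9000`. To preview the effect before editing the file in place, the same expression can be run without `-i`. The helper name below is hypothetical.

```shell
# Apply the same substitution as above to text on stdin,
# without modifying any file.
remap_host_port() {
  sed 's/8000:8000/9000:8000/g'
}
```

Usage: `remap_host_port < docker-compose.yml | grep 9000`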

# Install Fabric

Run in terminal:

```
mkdir -p ~/.local/bin
cd ~/.local/bin
curl -L https://github.com/danielmiessler/fabric/releases/latest/download/fabric-linux-amd64 > fabric && chmod +x fabric && ./fabric --version
fabric --setup
```

# Install Agentmake AI

Run in terminal:

```
# optional: navigate to home directory
cd
# install in a virtual environment
python3 -m venv ai
source ai/bin/activate
pip install --upgrade "agentmake[genai]"
# single quotes keep $PATH and $HOME unexpanded until ~/.bashrc is read
echo 'export PATH=$PATH:$HOME/ai/bin' >> ~/.bashrc
# To test
ai Hi!
```

## Edit Configurations

To edit configurations, run in terminal:

> ai -ec

## Test with Ollama

> ai Hi!

Remarks: Ollama is set as the default backend, so you can use the `ai` or `aic` commands without specifying the backend option. Run `ai -ec` to edit configurations.

## Test with Llama.cpp

To access model files downloaded via Ollama, add your user to the group `ollama`:

> sudo usermod -a -G ollama $LOGNAME

> sudo reboot

To download models via Ollama and save a copy of each in `~/agentmake/models/gguf/` (the default location), e.g.:

> ai --get_model deepseek-r1 -gm llama3.3:70b -gm aya-expanse

To run an instance of llama-server, assuming that you have set up an alias as mentioned [here](https://github.com/eliranwong/AMD_iGPU_AI_Setup#alias-for-launching-llama-server), e.g.:

> llamacpp deepseek-r1.gguf

To run agentmake with llama.cpp, e.g.:

> ai -b llamacpp Hi!

## Test with Perplexica

To list available tools that work with perplexica, run:

> ai -lt | grep perplexica

Expected output:

```
perplexica/openai
perplexica/groq
perplexica/xai
perplexica/googleai
perplexica/anthropic
perplexica/github
```

To use one of them, e.g.:

> ai -t perplexica/github What is AgentMake AI?

## Test with SearXNG

SearXNG is installed automatically with Perplexica. To get real-time information, e.g.:

> ai -t search/searxng Give me news updates in London today.

## Test with Fabric Integration

Assuming fabric patterns are downloaded, e.g.:

> ai What are AI agents? -sys fabric.write_micro_essay -b genai

## Test with Selected Text in Any Applications

First, make sure `xsel` is installed:

> sudo apt install xsel

Launch `Settings` > Keyboard > View and Customise Shortcuts > Custom Shortcuts > +

Fill in the fields as below (replace `username` with your own username):

```
Name: AgentMake AI
Command: gnome-terminal -- bash -c "/home/username/ai/bin/ai -i -eo -py"
Shortcut: Shift+Ctrl+A
```

![Image](https://github.com/user-attachments/assets/d21fea9a-2288-4e85-96ad-dfbee7ce160d)

Select some text in an application, then press `Shift+Ctrl+A`.

Choose a predefined instruction:

![Image](https://github.com/user-attachments/assets/e4872498-0cef-48e7-a550-55c0c4234929)

Assistant response is automatically copied to clipboard.

Remarks: You can define up to 10 custom instructions to choose from in the dialog, by specifying the values of `CUSTOM_INSTRUCTION_1`, `CUSTOM_INSTRUCTION_2`, `CUSTOM_INSTRUCTION_3`, ... `CUSTOM_INSTRUCTION_10` in AgentMake configurations (run `ai -ec` to edit).

## Note about Azure AI Setup

An easy way to deploy AI models via Azure service:

1. Sign in at https://ai.azure.com/github
2. All resources > Create New
3. Overview > copy an API key, Azure OpenAI Service and Azure AI inference endpoints

* Use Azure OpenAI Service endpoint for running OpenAI models; the endpoint should look like https://resource_name.openai.azure.com/

* Use Azure AI inference endpoint for running DeepSeek-R1 and Phi-4; the endpoint should look like https://resource_name.services.ai.azure.com/models

To configure AgentMake AI, run:

> ai -ec

## Note about Vertex AI

Make sure the extra package `genai` is installed with the command mentioned above:

> pip install --upgrade "agentmake[genai]"

To configure, run:

> ai -ec

Enter the path of your Google application credentials JSON file as the value of `VERTEXAI_API_KEY`. You also need to specify your project ID and service location in the configurations, e.g.:

```
VERTEXAI_API_KEY=~/agentmake/google_application_credentials.json
VERTEXAI_API_PROJECT_ID=my_project_id
VERTEXAI_API_SERVICE_LOCATION=us-central1
```

To test Gemini 2.0 with Vertex AI, e.g.:

> ai -b vertexai -m gemini-2.0-flash Hi!

## Using other backends and tools

AgentMake AI supports 14 AI backends and 7 agentic components.

Read more at https://github.com/eliranwong/agentmake

# Install OpenClaw with Discord Integration

[OpenClaw Setup](openclaw.md)

# Speed Tests

Tested with the same prompt `"What is machine learning?"`:

## Dedicated Graphics Memory: 16G

* Llama.cpp+ROCm [ Prompt: 448.5 t/s | Generation: 28.7 t/s ]

* Llama.cpp+Vulkan [ Prompt: 341.1 t/s | Generation: 29.4 t/s ]

* Ollama [ prompt eval rate: 64.58 tokens/s | eval rate: 16.93 tokens/s ]

## Dedicated Graphics Memory: 32G

* Llama.cpp+ROCm [ Prompt: 562.1 t/s | Generation: 29.7 t/s ]

* Llama.cpp+Vulkan [ Prompt: 410.7 t/s | Generation: 30.4 t/s ]

* Ollama [ prompt eval rate: 63.77 tokens/s | eval rate: 16.83 tokens/s ]

## Dedicated Graphics Memory: 48G

* Llama.cpp+ROCm [ Prompt: 551.6 t/s | Generation: 29.8 t/s ]

* Llama.cpp+Vulkan [ Prompt: 399.0 t/s | Generation: 30.5 t/s ]

* Ollama [ prompt eval rate: 21.56 tokens/s | eval rate: 14.92 tokens/s ]
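
To compare any two of the figures above, a small ratio helper can be handy. The sketch below only does the arithmetic on numbers taken from these results; `ratio` is a hypothetical name.

```shell
# Print a/b rounded to two decimals.
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Examples with the 32G results:
ratio 562.1 410.7   # ROCm vs Vulkan prompt speed, prints 1.37
ratio 30.4 29.7     # Vulkan vs ROCm generation speed, prints 1.02
```

In other words, with 32G of dedicated graphics memory the ROCm build processed the prompt about 1.37 times as fast as the Vulkan build, while the Vulkan build generated tokens about 1.02 times as fast.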