Performance of llama.cpp with Vulkan #10879
Replies: 9 comments 8 replies
-
AMD FirePro W8100
|
Beta Was this translation helpful? Give feedback.
-
AMD RX 470
|
Beta Was this translation helpful? Give feedback.
-
ubuntu 24.04, vulkan and cuda installed from official APT packages.
build: 4da69d1 (4351) vs CUDA on the same build/setup
build: 4da69d1 (4351) |
Beta Was this translation helpful? Give feedback.
-
Macbook Air M2 on Asahi Linux ggml_vulkan: Found 1 Vulkan devices:
|
Beta Was this translation helpful? Give feedback.
-
Gentoo Linux on ROG Ally (2023) Ryzen Z1 Extreme ggml_vulkan: Found 1 Vulkan devices:
|
Beta Was this translation helpful? Give feedback.
-
ggml_vulkan: Found 4 Vulkan devices:
|
Beta Was this translation helpful? Give feedback.
-
build: 0d52a69 (4439) NVIDIA GeForce RTX 3090 (NVIDIA)
AMD Radeon RX 6800 XT (RADV NAVI21) (radv)
AMD Radeon (TM) Pro VII (RADV VEGA20) (radv)
Intel(R) Arc(tm) A770 Graphics (DG2) (Intel open-source Mesa driver)
|
Beta Was this translation helpful? Give feedback.
-
@netrunnereve Some of the tg results here are a little low, I think they might be debug builds. The cmake step (at least on Linux) might require |
Beta Was this translation helpful? Give feedback.
-
Build: 8d59d91 (4450)
Lack of proper Xe coopmat support in the ANV driver is a setback honestly.
edit: retested both with the default batch size. |
Beta Was this translation helpful? Give feedback.
-
This is similar to the Apple Silicon benchmark thread, but for Vulkan! Many improvements have been made to the Vulkan backend in the past month and I think it's good to consolidate and discuss our results here.
We'll be testing the Llama 2 7B model like the other thread to keep things consistent, and use Q4_0 as it's simple to compute and small enough to fit on a 4GB GPU. You can download it here.
Instructions
Share your llama-bench results along with the git hash and Vulkan info string in the comments. Feel free to try other models, compare backends, and so forth, but only valid runs will be placed on the scoreboard.
Vulkan Scoreboard for Llama 2 7B, Q4_0 (sorted by tg128)
Beta Was this translation helpful? Give feedback.
All reactions