docker run --platform linux/aarch64 --rm -it --name llama.cpp-full -v $PWD/models:/models ghcr.io/ggerganov/llama.cpp:full-b4410 --run -m /models/stories15M_MOE-Q8_0.gguf -p "Building a website can be done in 10 simple steps:"
...
echo $?
139
When executing the run in the container with bash, it additionally prints "Segmentation fault (core dumped)"
docker run --entrypoint /bin/bash --platform linux/aarch64 --rm -it --name llama.cpp-full -v $PWD/models:/models ghcr.io/ggerganov/llama.cpp:full-b4410
./llama-cli -m /models/stories15M_MOE-Q8_0.gguf -p "Building a website can be done in 10 simple steps:"
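The exit status 139 is consistent with the "Segmentation fault" message: POSIX shells report a process killed by a signal as 128 plus the signal number, and SIGSEGV is signal 11 on Linux. A minimal sketch for decoding such a status follows. The helper name `decode_status` is hypothetical and not part of llama.cpp or Docker, and it assumes any status above 128 came from a signal, which is not guaranteed.

```shell
# Hypothetical helper: interpret a shell exit status.
# Assumption: a status above 128 means "killed by signal (status - 128)".
decode_status() {
  status=$1
  if [ "$status" -gt 128 ]; then
    echo "killed by signal $((status - 128))"
  else
    echo "exited with status $status"
  fi
}

decode_status 139   # prints: killed by signal 11
```

Running `kill -l 11` in bash prints the signal name for that number, which should be SEGV on Linux.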
run with docker on amd64 - it succeeds
docker run --platform linux/amd64 --rm -it --name llama.cpp-full -v $PWD/models:/models ghcr.io/ggerganov/llama.cpp:full-b4410 --run -m /models/stories15M_MOE-Q8_0.gguf -p "Building a website can be done in 10 simple steps:"
# use docker stop llama.cpp-full to stop that run
I see the aarch64 run does not load_backend, and amd64 does
< build: 4410 (4b0c638b) with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for aarch64-linux-gnu
---
> load_backend: loaded CPU backend from ./libggml-cpu-haswell.so
> build: 4410 (4b0c638b) with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
I am facing the same issue. When running the llama.cpp:server image with --platform linux/arm64, the server won't start. If I use the --platform linux/amd64 flag, the server starts but is incredibly slow.
Name and Version
version: 4410 (4b0c638)
built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for aarch64-linux-gnu
Operating systems
Mac
GGML backends
CPU
Hardware
MacOS M1
Models
https://huggingface.co/ggml-org/stories15M_MOE stories15M_MOE-Q8_0.gguf
Problem description & steps to reproduce
download the gguf model
run with docker on aarch64 - it fails
When executing the run in the container with bash, it additionally prints "Segmentation fault (core dumped)"
run with docker on amd64 - it succeeds
I see the aarch64 run does not load_backend, and amd64 does
First Bad Commit
No response
Relevant log output