Add instructions for llama.cpp (#16)
* Add llama.cpp usage instructions

* Quick fix to admonition title syntax
danbarr authored Dec 20, 2024
1 parent 6eac9cd commit bf14bd7
Showing 2 changed files with 34 additions and 3 deletions.
35 changes: 33 additions & 2 deletions docs/how-to/use-with-continue.mdx
@@ -273,8 +273,39 @@ Replace `YOUR_API_KEY` with your
</TabItem>
<TabItem value="llamacpp" label="llama.cpp">

Replace `MODEL_NAME` with the name of a model you have available locally with
`llama.cpp`, such as `qwen2.5-coder-1.5b-instruct-q5_k_m`.
:::note Performance

Docker containers on macOS cannot access the GPU, which impacts the performance
of llama.cpp in CodeGate. For better performance on macOS, we recommend using a
standalone Ollama installation.

:::

CodeGate has built-in support for llama.cpp. This is considered an advanced
option, best suited to quick experimentation with various coding models.

To use this provider, download your desired model file in GGUF format from the
[Hugging Face library](https://huggingface.co/models?library=gguf&sort=trending).
Then copy it into the `/app/codegate_volume/models` directory in the CodeGate
container. To persist models between restarts, run CodeGate with a Docker
volume as shown in the [recommended configuration](./install.md#recommended-settings).
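A named volume mounted at `/app/codegate_volume` keeps downloaded models across
container restarts. As a rough sketch only (the image name, tag, and port are
assumptions here, not taken from this page; follow the install guide for the
exact command):

```shell
# Hypothetical example: run CodeGate with a named volume so that files
# copied into /app/codegate_volume/models survive container restarts.
# Image name and port mapping are assumptions -- see the install docs.
docker run -d --name codegate \
  -p 8989:8989 \
  -v codegate_volume:/app/codegate_volume \
  ghcr.io/stacklok/codegate:latest
```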

The following example uses `huggingface-cli` to download our recommended models
for chat (a 7B or larger model gives the best results) and autocomplete (a 1.5B
or 3B model balances speed and quality):

```bash
# For chat functions
huggingface-cli download Qwen/Qwen2.5-7B-Instruct-GGUF qwen2.5-7b-instruct-q5_k_m.gguf --local-dir .
docker cp qwen2.5-7b-instruct-q5_k_m.gguf codegate:/app/codegate_volume/models/

# For autocomplete functions
huggingface-cli download Qwen/Qwen2.5-1.5B-Instruct-GGUF qwen2.5-1.5b-instruct-q5_k_m.gguf --local-dir .
docker cp qwen2.5-1.5b-instruct-q5_k_m.gguf codegate:/app/codegate_volume/models/
```

In the Continue config file, replace `MODEL_NAME` with the file name without
the `.gguf` extension, for example `qwen2.5-7b-instruct-q5_k_m`.

```json title="~/.continue/config.json"
{
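The config block above is collapsed in this diff. As a rough sketch only (the
`apiBase` path and all field values here are assumptions, not taken from this
diff; check the Continue and CodeGate docs for the real values), an llama.cpp
model entry in Continue's `models` schema might look like:

```json
{
  "models": [
    {
      "title": "CodeGate llama.cpp",
      "provider": "llama.cpp",
      "model": "qwen2.5-7b-instruct-q5_k_m",
      "apiBase": "http://localhost:8989/llamacpp"
    }
  ]
}
```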
2 changes: 1 addition & 1 deletion docs/quickstart-copilot.mdx
@@ -52,7 +52,7 @@ browser: [http://localhost:9090](http://localhost:9090)
To enable CodeGate, you must install its Certificate Authority (CA) into your
certificate trust store.

:::info[Why is this needed?]
:::info Why is this needed?

The CA certificate allows CodeGate to securely intercept and modify traffic
between GitHub Copilot and your IDE. Decrypted traffic never leaves your local
