
add block_size arg to chat api #2986

Draft · wants to merge 3 commits into main
Conversation

JackWeiw (Contributor) commented on Jan 3, 2025

Motivation

Since the camb device backend only supports block_size=16 for the paged_attention-related ops, we need a way to control block_size.

This PR adds a cache_block_seq_len argument to the lmdeploy chat CLI.

After this PR, we can run lmdeploy chat /Shanghai_AI_Laboratory/internlm2_5-7b --backend pytorch --device camb --cache-block-seq-len 16 to chat on a camb device.
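
For reference, the same setting can also be expressed through the Python API rather than the CLI. This is a minimal sketch assuming the standard lmdeploy pipeline entry point and the device_type / block_size fields of PytorchEngineConfig; camb as a device_type is still WIP, as discussed below.

    from lmdeploy import pipeline
    from lmdeploy.messages import PytorchEngineConfig

    # block_size=16 matches the only block size supported by the camb
    # paged_attention kernels mentioned above.
    engine_config = PytorchEngineConfig(device_type='camb', block_size=16)
    pipe = pipeline('/Shanghai_AI_Laboratory/internlm2_5-7b',
                    backend_config=engine_config)
    print(pipe(['Hello, who are you?']))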

jinminxi104 marked this pull request as draft on January 6, 2025
grimoire requested a review from RunningLeon on January 7, 2025
@@ -256,6 +259,12 @@ def chat(args):
    if backend == 'pytorch':
        from lmdeploy.messages import PytorchEngineConfig
        from lmdeploy.pytorch.chat import run_chat
        block_size = 64  # default block size
        if args.device == 'camb':
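
(The hunk is cut short here by the inline review comment that follows. A plausible continuation of the camb branch, assuming the new cache_block_seq_len argument feeds the override before it is handed to PytorchEngineConfig — an illustration, not the verbatim diff:)

            # Hypothetical continuation: honor the CLI override; camb kernels
            # currently only support a block size of 16.
            block_size = args.cache_block_seq_len or 16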
Collaborator
hi, camb is not in the choices of --device:

choices: List[str] = ['cuda', 'ascend', 'maca']):

Contributor Author

Yes, camb support is WIP and will be merged as part of the dlinfer backend in the future; we can apply this PR after the main camb work is merged.

Collaborator
Got it
