GPTQ - Move `quantized_model` to CUDA device #1535

samuel100 · 2024-12-31T10:38:48Z

Describe your changes

When using GPTQ, `quantized_model` must be moved to the CUDA device before quantization to avoid the "Expected all tensors to be on the same device" error in auto-gptq. See AutoGPTQ/AutoGPTQ#729.
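
For illustration only, here is a minimal sketch of the kind of device move described above. It is not the diff in this PR. The helper names `pick_device` and `move_quantized_model` are hypothetical. The sketch assumes the quantized model object exposes a `.to(device)` method, as `torch.nn.Module` does.

```python
def pick_device():
    """Return "cuda:0" when PyTorch reports a usable GPU, else "cpu"."""
    try:
        import torch  # imported lazily so the sketch also loads without PyTorch
    except ImportError:
        return "cpu"
    return "cuda:0" if torch.cuda.is_available() else "cpu"


def move_quantized_model(quantized_model, device=None):
    """Move the model to `device` so its weights and inputs share one device.

    Hypothetical helper: it assumes `quantized_model` has a `.to(device)`
    method and ignores that method's return value.
    """
    if device is None:
        device = pick_device()
    quantized_model.to(device)
    return quantized_model
```

With a helper like this, the call site would move the model once, right before quantization starts, so the calibration inputs and the model weights end up on the same device.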

Checklist before requesting a review

Add unit tests for this change.
Make sure all tests can pass.
Update documents if necessary.
Lint and apply fixes to your code by running `lintrunner -a`.
Is this a user-facing change? If yes, give a description of this change to be included in the release notes.
Is this PR including examples changes? If yes, please remember to update example documentation in a follow-up PR.

xiaoyu-work · 2025-01-03T05:55:23Z

According to the discussion thread, it seems this was already fixed by AutoGPTQ/AutoGPTQ#607?

samuel100 added 3 commits December 30, 2024 17:15

explicitly move quantized model to cuda device

4218e36

add gh issue to comment

307c960

fixed linting

b474b8b
