v0.5.2

@ajindal1 ajindal1 released this 26 Nov 18:05
27bcf6c

Release Notes

Patch release 0.5.2 adds:

  • Fixes for bugs #1074 and #1092 via PRs #1065 and #1070
  • A fix to the NuGet sample in the package README so that it shows correct disposal of objects
  • Extra validation via PRs #1050 and #1066

Features in 0.5.0:

  • Support for MultiLoRA
  • Multi-frame support for Phi-3 vision and Phi-3.5 vision models
  • Support for the Phi-3 MoE model
  • Support for the NVIDIA Nemotron model
  • Support for the Qwen model
  • Addition of the Set Terminate feature, which allows users to cancel mid-generation
  • Soft capping support for Group Query Attention
  • Quantization support extended to embedding and LM head layers
  • Mac support in published packages
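
For readers unfamiliar with the soft capping item above: soft capping is commonly defined as passing attention scores through a scaled tanh so that they stay within a fixed range without a hard clip. The following is a minimal plain-Python sketch of that formula only. It is not this package's implementation, and the `soft_cap` function and the cap value of 50.0 are hypothetical names and numbers chosen for illustration.

```python
import math


def soft_cap(scores, cap):
    """Squash each score smoothly into (-cap, cap) using cap * tanh(score / cap).

    Hypothetical helper for illustration; not part of any released API.
    """
    if cap <= 0:
        raise ValueError("cap must be positive")
    return [cap * math.tanh(s / cap) for s in scores]


# Small scores pass through almost unchanged; large ones saturate near +/-cap.
raw_scores = [-200.0, -1.0, 0.0, 1.0, 200.0]
capped = soft_cap(raw_scores, cap=50.0)
print(capped)
```

Unlike a hard clip, the tanh form stays smooth and strictly increasing, so the relative ordering of scores is preserved.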

Known issues

  • Models running with DirectML do not support batching
  • Python 3.13 is not supported in this release