Beta 7 release of azure-ai-inference client library #38997

Open · wants to merge 4 commits into base: main
4 changes: 4 additions & 0 deletions sdk/ai/azure-ai-inference/CHANGELOG.md
@@ -2,6 +2,10 @@

## 1.0.0b7 (Unreleased)

### Features Added

* Added a client for Image Embeddings, named `ImageEmbeddingsClient`. See package README.md and new samples.

### Bugs Fixed

* Fix a bug that caused an error when tracing was enabled, `azure-core-tracing-opentelemetry` was not installed, and asynchronous chat completions were used.
49 changes: 40 additions & 9 deletions sdk/ai/azure-ai-inference/README.md
@@ -6,7 +6,7 @@ Use the Inference client library (in preview) to:
* Get information about the AI model
* Do chat completions
* Get text embeddings
<!-- * Get image embeddings -->
* Get image embeddings

The Inference client library supports AI models deployed to the following services:

@@ -219,17 +219,15 @@ See simple chat completion examples below. More can be found in the [samples](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/ai/azure-ai-inference/samples) folder.

### Text Embeddings

The `EmbeddingsClient` has a method named `embedding`. The method makes a REST API call to the `/embeddings` route on the provided endpoint, as documented in [the REST API reference](https://learn.microsoft.com/azure/ai-studio/reference/reference-model-inference-embeddings).
The `EmbeddingsClient` has a method named `embed`. The method makes a REST API call to the `/embeddings` route on the provided endpoint, as documented in [the REST API reference](https://learn.microsoft.com/azure/ai-studio/reference/reference-model-inference-embeddings).

See simple text embedding example below. More can be found in the [samples](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/ai/azure-ai-inference/samples) folder.

<!--
### Image Embeddings

TODO: Add overview and link to explain image embeddings.
The `ImageEmbeddingsClient` has a method named `embed`. The method makes a REST API call to the `/images/embeddings` route on the provided endpoint, as documented in [the REST API reference](https://learn.microsoft.com/azure/ai-studio/reference/reference-model-inference-images-embeddings).

Embeddings operations target the URL route `images/embeddings` on the provided endpoint.
-->
See simple image embedding example below. More can be found in the [samples](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/ai/azure-ai-inference/samples) folder.

## Examples

@@ -239,7 +237,7 @@ In the following sections you will find simple examples of:
* [Streaming chat completions](#streaming-chat-completions-example)
* [Chat completions with additional model-specific parameters](#chat-completions-with-additional-model-specific-parameters)
* [Text Embeddings](#text-embeddings-example)
<!-- * [Image Embeddings](#image-embeddings-example) -->
* [Image Embeddings](#image-embeddings-example)

The examples create a synchronous client assuming a Serverless API or Managed Compute endpoint. Modify client
construction code as described in [Key concepts](#key-concepts) to have it work with GitHub Models endpoint or Azure OpenAI
@@ -412,6 +410,39 @@ data[2]: length=1024, [0.04196167, 0.029083252, ..., -0.0027484894, 0.0073127747

To generate embeddings for additional phrases, simply call `client.embed` multiple times using the same `client`.
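A minimal sketch of that pattern, assuming `endpoint` and `key` are already defined (the phrases here are illustrative):

```python
from azure.ai.inference import EmbeddingsClient
from azure.core.credentials import AzureKeyCredential

client = EmbeddingsClient(endpoint=endpoint, credential=AzureKeyCredential(key))

# Reuse the same client across embed calls.
for phrases in (["first phrase", "second phrase"], ["third phrase"]):
    response = client.embed(input=phrases)
    for item in response.data:
        print(f"data[{item.index}]: length={len(item.embedding)}")
```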

### Image Embeddings example

This example demonstrates how to get image embeddings for a Serverless API or Managed Compute endpoint, with key authentication, assuming `endpoint` and `key` are already defined. For Entra ID authentication, a GitHub Models endpoint, or an Azure OpenAI endpoint, modify the code to create the client as described in the sections above.

<!-- SNIPPET:sample_image_embeddings.image_embeddings -->

```python
from azure.ai.inference import ImageEmbeddingsClient
from azure.ai.inference.models import ImageEmbeddingInput
from azure.core.credentials import AzureKeyCredential

client = ImageEmbeddingsClient(endpoint=endpoint, credential=AzureKeyCredential(key))

response = client.embed(input=[ImageEmbeddingInput.load(image_file="sample1.png", image_format="png")])

for item in response.data:
    length = len(item.embedding)
    print(
        f"data[{item.index}]: length={length}, [{item.embedding[0]}, {item.embedding[1]}, "
        f"..., {item.embedding[length-2]}, {item.embedding[length-1]}]"
    )
```

<!-- END SNIPPET -->

The length of the embedding vector depends on the model, but you should see something like this:

```text
data[0]: length=1024, [0.0103302, -0.04425049, ..., -0.011543274, -0.0009088516]
```

To generate image embeddings for additional images, simply call `client.embed` multiple times using the same `client`.
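`ImageEmbeddingInput` also accepts optional text for models that embed image-text pairs (like DINO, CLIP); models that don't support the `text` parameter return a 422 error. A short sketch of that variant, reusing the `client` from the snippet above (the prompt string is illustrative):

```python
from azure.ai.inference.models import ImageEmbeddingInput

# Pair the image with a text prompt for multimodal embedding models.
pair = ImageEmbeddingInput.load(image_file="sample1.png", image_format="png", text="a photo of a cat")
response = client.embed(input=[pair])
```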

<!--
### Image Embeddings example

@@ -421,7 +452,7 @@ This example demonstrates how to get image embeddings.

```python
from azure.ai.inference import ImageEmbeddingsClient
from azure.ai.inference.models import EmbeddingInput
from azure.ai.inference.models import ImageEmbeddingInput
from azure.core.credentials import AzureKeyCredential

with open("sample1.png", "rb") as f:
@@ -431,7 +462,7 @@ with open("sample2.png", "rb") as f:

client = ImageEmbeddingsClient(endpoint=endpoint, credential=AzureKeyCredential(key))

response = client.embed(input=[EmbeddingInput(image=image1), EmbeddingInput(image=image2)])
response = client.embed(input=[ImageEmbeddingInput(image=image1), ImageEmbeddingInput(image=image2)])

for item in response.data:
    length = len(item.embedding)
2 changes: 1 addition & 1 deletion sdk/ai/azure-ai-inference/assets.json
@@ -2,5 +2,5 @@
"AssetsRepo": "Azure/azure-sdk-assets",
"AssetsRepoPrefixPath": "python",
"TagPrefix": "python/ai/azure-ai-inference",
"Tag": "python/ai/azure-ai-inference_f4719b1cae"
"Tag": "python/ai/azure-ai-inference_a1cd8e128d"
}
@@ -703,7 +703,7 @@ def _embed(
def _embed(
self,
*,
input: List[_models.EmbeddingInput],
input: List[_models.ImageEmbeddingInput],
extra_params: Optional[Union[str, _models._enums.ExtraParameters]] = None,
content_type: str = "application/json",
dimensions: Optional[int] = None,
@@ -727,7 +727,7 @@ def _embed(
self,
body: Union[JSON, IO[bytes]] = _Unset,
*,
input: List[_models.EmbeddingInput] = _Unset,
input: List[_models.ImageEmbeddingInput] = _Unset,
extra_params: Optional[Union[str, _models._enums.ExtraParameters]] = None,
dimensions: Optional[int] = None,
encoding_format: Optional[Union[str, _models.EmbeddingEncodingFormat]] = None,
@@ -743,7 +743,7 @@ def _embed(
:keyword input: Input image to embed. To embed multiple inputs in a single request, pass an
array.
The input must not exceed the max input tokens for the model. Required.
:paramtype input: list[~azure.ai.inference.models.EmbeddingInput]
:paramtype input: list[~azure.ai.inference.models.ImageEmbeddingInput]
:keyword extra_params: Controls what happens if extra parameters, undefined by the REST API,
are passed in the JSON request payload.
This sets the HTTP request header ``extra-parameters``. Known values are: "error", "drop", and
8 changes: 4 additions & 4 deletions sdk/ai/azure-ai-inference/azure/ai/inference/_patch.py
@@ -1053,7 +1053,7 @@ def __init__(
def embed(
self,
*,
input: List[_models.EmbeddingInput],
input: List[_models.ImageEmbeddingInput],
dimensions: Optional[int] = None,
encoding_format: Optional[Union[str, _models.EmbeddingEncodingFormat]] = None,
input_type: Optional[Union[str, _models.EmbeddingInputType]] = None,
@@ -1067,7 +1067,7 @@ def embed(
:keyword input: Input image to embed. To embed multiple inputs in a single request, pass an
array.
The input must not exceed the max input tokens for the model. Required.
:paramtype input: list[~azure.ai.inference.models.EmbeddingInput]
:paramtype input: list[~azure.ai.inference.models.ImageEmbeddingInput]
:keyword dimensions: Optional. The number of dimensions the resulting output embeddings should
have. Default value is None.
:paramtype dimensions: int
@@ -1139,7 +1139,7 @@ def embed(
self,
body: Union[JSON, IO[bytes]] = _Unset,
*,
input: List[_models.EmbeddingInput] = _Unset,
input: List[_models.ImageEmbeddingInput] = _Unset,
dimensions: Optional[int] = None,
encoding_format: Optional[Union[str, _models.EmbeddingEncodingFormat]] = None,
input_type: Optional[Union[str, _models.EmbeddingInputType]] = None,
Expand All @@ -1157,7 +1157,7 @@ def embed(
:keyword input: Input image to embed. To embed multiple inputs in a single request, pass an
array.
The input must not exceed the max input tokens for the model. Required.
:paramtype input: list[~azure.ai.inference.models.EmbeddingInput]
:paramtype input: list[~azure.ai.inference.models.ImageEmbeddingInput]
:keyword dimensions: Optional. The number of dimensions the resulting output embeddings should
have. Default value is None.
:paramtype dimensions: int
@@ -572,7 +572,7 @@ async def _embed(
async def _embed(
self,
*,
input: List[_models.EmbeddingInput],
input: List[_models.ImageEmbeddingInput],
extra_params: Optional[Union[str, _models._enums.ExtraParameters]] = None,
content_type: str = "application/json",
dimensions: Optional[int] = None,
@@ -596,7 +596,7 @@ async def _embed(
self,
body: Union[JSON, IO[bytes]] = _Unset,
*,
input: List[_models.EmbeddingInput] = _Unset,
input: List[_models.ImageEmbeddingInput] = _Unset,
extra_params: Optional[Union[str, _models._enums.ExtraParameters]] = None,
dimensions: Optional[int] = None,
encoding_format: Optional[Union[str, _models.EmbeddingEncodingFormat]] = None,
@@ -612,7 +612,7 @@ async def _embed(
:keyword input: Input image to embed. To embed multiple inputs in a single request, pass an
array.
The input must not exceed the max input tokens for the model. Required.
:paramtype input: list[~azure.ai.inference.models.EmbeddingInput]
:paramtype input: list[~azure.ai.inference.models.ImageEmbeddingInput]
:keyword extra_params: Controls what happens if extra parameters, undefined by the REST API,
are passed in the JSON request payload.
This sets the HTTP request header ``extra-parameters``. Known values are: "error", "drop", and
10 changes: 5 additions & 5 deletions sdk/ai/azure-ai-inference/azure/ai/inference/aio/_patch.py
@@ -86,7 +86,7 @@ async def load_client(
"The AI model information is missing a value for `model type`. Cannot create an appropriate client."
)

# TODO: Remove "completions" and "embedding" once Mistral Large and Cohere fixes their model type
# TODO: Remove "completions", "chat-completions" and "embedding" once Mistral Large and Cohere fix their model type
if model_info.model_type in (_models.ModelType.CHAT, "completion", "chat-completion", "chat-completions"):
chat_completion_client = ChatCompletionsClient(endpoint, credential, **kwargs)
chat_completion_client._model_info = ( # pylint: disable=protected-access,attribute-defined-outside-init
@@ -1035,7 +1035,7 @@ def __init__(
async def embed(
self,
*,
input: List[_models.EmbeddingInput],
input: List[_models.ImageEmbeddingInput],
dimensions: Optional[int] = None,
encoding_format: Optional[Union[str, _models.EmbeddingEncodingFormat]] = None,
input_type: Optional[Union[str, _models.EmbeddingInputType]] = None,
@@ -1049,7 +1049,7 @@ async def embed(
:keyword input: Input image to embed. To embed multiple inputs in a single request, pass an
array.
The input must not exceed the max input tokens for the model. Required.
:paramtype input: list[~azure.ai.inference.models.EmbeddingInput]
:paramtype input: list[~azure.ai.inference.models.ImageEmbeddingInput]
:keyword dimensions: Optional. The number of dimensions the resulting output embeddings should
have. Default value is None.
:paramtype dimensions: int
@@ -1121,7 +1121,7 @@ async def embed(
self,
body: Union[JSON, IO[bytes]] = _Unset,
*,
input: List[_models.EmbeddingInput] = _Unset,
input: List[_models.ImageEmbeddingInput] = _Unset,
dimensions: Optional[int] = None,
encoding_format: Optional[Union[str, _models.EmbeddingEncodingFormat]] = None,
input_type: Optional[Union[str, _models.EmbeddingInputType]] = None,
@@ -1139,7 +1139,7 @@ async def embed(
:keyword input: Input image to embed. To embed multiple inputs in a single request, pass an
array.
The input must not exceed the max input tokens for the model. Required.
:paramtype input: list[~azure.ai.inference.models.EmbeddingInput]
:paramtype input: list[~azure.ai.inference.models.ImageEmbeddingInput]
:keyword dimensions: Optional. The number of dimensions the resulting output embeddings should
have. Default value is None.
:paramtype dimensions: int
@@ -20,7 +20,7 @@
from ._models import ChatResponseMessage
from ._models import CompletionsUsage
from ._models import ContentItem
from ._models import EmbeddingInput
from ._patch import ImageEmbeddingInput
from ._models import EmbeddingItem
from ._patch import EmbeddingsResult
from ._models import EmbeddingsUsage
@@ -67,7 +67,7 @@
"ChatResponseMessage",
"CompletionsUsage",
"ContentItem",
"EmbeddingInput",
"ImageEmbeddingInput",
"EmbeddingItem",
"EmbeddingsResult",
"EmbeddingsUsage",
@@ -584,12 +584,13 @@ def __init__(self, *args: Any, **kwargs: Any) -> None:  # pylint: disable=useles
super().__init__(*args, **kwargs)


class EmbeddingInput(_model_base.Model):
class ImageEmbeddingInput(_model_base.Model):
"""Represents an image with optional text.

All required parameters must be populated in order to send to server.

:ivar image: The input image, in PNG format. Required.
:ivar image: The input image, encoded as a base64 string in a data URL.
Example: `data:image/{format};base64,{data}`. Required.
:vartype image: str
:ivar text: Optional. The text input to feed into the model (like DINO, CLIP).
Returns a 422 error if the model doesn't support the value or parameter.
28 changes: 28 additions & 0 deletions sdk/ai/azure-ai-inference/azure/ai/inference/models/_patch.py
@@ -19,6 +19,7 @@
from ._models import ImageUrl as ImageUrlGenerated
from ._models import ChatCompletions as ChatCompletionsGenerated
from ._models import EmbeddingsResult as EmbeddingsResultGenerated
from ._models import ImageEmbeddingInput as EmbeddingInputGenerated
from .. import models as _models

if sys.version_info >= (3, 11):
@@ -106,6 +107,32 @@ def load(
return cls(url=url, detail=detail)


class ImageEmbeddingInput(EmbeddingInputGenerated):

    @classmethod
    def load(cls, *, image_file: str, image_format: str, text: Optional[str] = None) -> Self:
        """
        Create an ImageEmbeddingInput object from a local image file. The method reads the image
        file and encodes it as a base64 string, which together with the image format
        is then used to format the JSON `url` value passed in the request payload.

        :keyword image_file: The name of the local image file to load. Required.
        :paramtype image_file: str
        :keyword image_format: The MIME type format of the image. For example: "jpeg", "png". Required.
        :paramtype image_format: str
        :keyword text: Optional. The text input to feed into the model (like DINO, CLIP).
            Returns a 422 error if the model doesn't support the value or parameter.
        :paramtype text: str
        :return: An ImageEmbeddingInput object with the image data encoded as a base64 string.
        :rtype: ~azure.ai.inference.models.ImageEmbeddingInput
        :raises FileNotFoundError: when the image file could not be opened.
        """
        with open(image_file, "rb") as f:
            image_data = base64.b64encode(f.read()).decode("utf-8")
        image_uri = f"data:image/{image_format};base64,{image_data}"
        return cls(image=image_uri, text=text)
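A quick usage sketch for `load` (assuming a local `sample1.png` exists); the `image` field ends up holding the base64 data URL built above:

```python
# Hypothetical usage of the classmethod defined above.
item = ImageEmbeddingInput.load(image_file="sample1.png", image_format="png")
assert item.image.startswith("data:image/png;base64,")
```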


class BaseStreamingChatCompletions:
"""A base class for the sync and async streaming chat completions responses, holding any common code
to deserialize the Server Sent Events (SSE) response stream into chat completions updates, each one
@@ -268,6 +295,7 @@ async def aclose(self) -> None:

__all__: List[str] = [
"ImageUrl",
"ImageEmbeddingInput",
"ChatCompletions",
"EmbeddingsResult",
"StreamingChatCompletions",
16 changes: 6 additions & 10 deletions sdk/ai/azure-ai-inference/samples/README.md
@@ -56,10 +56,9 @@ Note that the client library does not directly read these environment variable a
| Sample type | Endpoint environment variable name | Key environment variable name |
|----------|----------|----------|
| Chat completions | `AZURE_AI_CHAT_ENDPOINT` | `AZURE_AI_CHAT_KEY` |
| Embeddings | `AZURE_AI_EMBEDDINGS_ENDPOINT` | `AZURE_AI_EMBEDDINGS_KEY` |
<!--
| Image generation | `IMAGE_GENERATION_ENDPOINT` | `IMAGE_GENERATION_KEY` |
-->
| Text embeddings | `AZURE_AI_EMBEDDINGS_ENDPOINT` | `AZURE_AI_EMBEDDINGS_KEY` |
| Image embeddings | `AZURE_AI_IMAGE_EMBEDDINGS_ENDPOINT` | `AZURE_AI_IMAGE_EMBEDDINGS_KEY` |


To run against a Managed Compute Endpoint, some samples also have an optional environment variable `AZURE_AI_CHAT_DEPLOYMENT_NAME`. This is the value used to set the HTTP request header `azureml-model-deployment` when constructing the client.
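For illustration, a sketch of how a sample might wire these together, assuming the client constructor accepts a `headers` mapping for setting the `azureml-model-deployment` request header (an assumption, not taken from this diff):

```python
import os

from azure.ai.inference import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential

endpoint = os.environ["AZURE_AI_CHAT_ENDPOINT"]
key = os.environ["AZURE_AI_CHAT_KEY"]
deployment = os.environ.get("AZURE_AI_CHAT_DEPLOYMENT_NAME")  # optional

# Attach the Managed Compute deployment header only when a deployment is set.
extra_headers = {"azureml-model-deployment": deployment} if deployment else {}

client = ChatCompletionsClient(
    endpoint=endpoint,
    credential=AzureKeyCredential(key),
    headers=extra_headers,
)
```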

@@ -118,13 +117,12 @@ similarly for the other samples.
|[sample_embeddings_with_defaults.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_embeddings_with_defaults.py) | One embeddings operation using a synchronous client, with default embeddings configuration set in the client constructor. |
|[sample_embeddings_azure_openai.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_embeddings_azure_openai.py) | One embeddings operation using a synchronous client, against Azure OpenAI endpoint. |

<!--
### Image embeddings

|**File Name**|**Description**|
|----------------|-------------|
|[sample_image_embeddings.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_image_embeddings.py) | One image embeddings operation, on two input images, using a synchronous client. |
-->
|[sample_image_embeddings.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_image_embeddings.py) | One image embeddings operation, using a synchronous client. |
|[sample_image_embeddings_with_defaults.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_image_embeddings_with_defaults.py) | One image embeddings operation using a synchronous client, with default embeddings configuration set in the client constructor. |

## Asynchronous client samples

@@ -145,13 +143,11 @@
|----------------|-------------|
|[sample_embeddings_async.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/async_samples/sample_embeddings_async.py) | One embeddings operation using an asynchronous client. |

<!--
### Image embeddings

|**File Name**|**Description**|
|----------------|-------------|
|[sample_image_embeddings_async.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/async_samples/sample_image_embeddings_async.py) | One image embeddings operation, on two input images, using an asynchronous client. |
-->
|[sample_image_embeddings_async.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/async_samples/sample_image_embeddings_async.py) | One image embeddings operation, using an asynchronous client. |

## Troubleshooting
