health_multimodal.image.inference_engine

Classes

ImageInferenceEngine(image_model, transform)

Encapsulate inference-time operations on an image model.

class health_multimodal.image.inference_engine.ImageInferenceEngine(image_model, transform)[source]

Encapsulate inference-time operations on an image model.

Parameters
  • image_model – Trained image model.

  • transform (Compose) – Transform to apply to the image after loading. Must return a torch.Tensor that can be input directly to the image model.
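
As an illustration only, the sketch below builds a torchvision transform pipeline and constructs the engine with it; trained_image_model is a hypothetical placeholder for a trained image model obtained elsewhere in the package, not something this page defines.

from torchvision.transforms import CenterCrop, Compose, Resize, ToTensor

from health_multimodal.image.inference_engine import ImageInferenceEngine

# `trained_image_model` is a hypothetical placeholder for a trained image model
# obtained elsewhere; the transform must return a torch.Tensor that the model
# can consume directly.
transform = Compose([Resize(512), CenterCrop(480), ToTensor()])
engine = ImageInferenceEngine(image_model=trained_image_model, transform=transform)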

get_projected_global_embedding(image_path)[source]

Compute global image embedding in the joint latent space.

Parameters

image_path (Path) – Path to the image to compute embeddings for.

Return type

Tensor

Returns

Torch tensor containing the L2-normalised global image embedding of shape [joint_feature_dim,], where joint_feature_dim is the dimensionality of the joint latent space.
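
A minimal usage sketch, assuming engine is the ImageInferenceEngine constructed above and "chest_xray.png" is a hypothetical local image file:

from pathlib import Path

import torch

# Compute the global embedding for a single image.
global_embedding = engine.get_projected_global_embedding(Path("chest_xray.png"))

print(global_embedding.shape)               # torch.Size([joint_feature_dim])
print(torch.linalg.norm(global_embedding))  # ~1.0, since the embedding is L2-normalised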

get_projected_patch_embeddings(image_path)[source]

Compute image patch embeddings in the joint latent space, preserving the image grid.

Parameters

image_path (Path) – Path to the image to compute embeddings for.

Return type

Tuple[Tensor, Tuple[int, int]]

Returns

A tuple containing the image patch embeddings and the shape of the original image (width, height) before applying transforms.
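
A brief sketch of how the returned tuple might be unpacked, again assuming engine from the constructor example and a hypothetical image path:

from pathlib import Path

# The first element keeps the spatial grid of the image (one embedding per patch
# position); the second is the original image size (width, height) before any
# transforms were applied.
patch_embeddings, (width, height) = engine.get_projected_patch_embeddings(
    Path("chest_xray.png")
)

print(patch_embeddings.shape, width, height)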

load_and_transform_input_image(image_path, transform)[source]

Read an image and apply the transform to it.

  1. Read the image from the given path

  2. Apply transform

  3. Add the batch dimension

  4. Move to the correct device

Parameters

  • image_path (Path) – Path to the image to load and transform.

  • transform (Compose) – Transform to apply to the image after loading.

Return type

Tuple[Tensor, Tuple[int, int]]

Returns

A tuple containing the transformed image tensor (with the batch dimension added and moved to the correct device) and the original shape of the image (width, height) before the transforms.
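
A short sketch of the steps listed above, assuming engine and transform from the constructor example and a hypothetical image path:

from pathlib import Path

# Read the image, apply the transform, add the batch dimension and move the
# tensor to the engine's device, returning the original (width, height) as well.
image_tensor, (width, height) = engine.load_and_transform_input_image(
    Path("chest_xray.png"), transform
)

# The returned tensor can be passed directly to the image model.
print(image_tensor.shape, width, height)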