health_multimodal.image.inference_engine

Classes

ImageInferenceEngine(image_model, transform)

Encapsulate inference-time operations on an image model.

class health_multimodal.image.inference_engine.ImageInferenceEngine(image_model, transform)[source]

Encapsulate inference-time operations on an image model.

Parameters
  • image_model – Trained image model.

  • transform (Compose) – Transform to apply to the image after loading. Must return a torch.Tensor that can be input directly to the image model.
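
As an illustration only, the sketch below builds a torchvision transform pipeline and constructs the engine with it; trained_image_model is a hypothetical placeholder for a trained image model obtained elsewhere in the package, not something this page defines.

from torchvision.transforms import CenterCrop, Compose, Resize, ToTensor

from health_multimodal.image.inference_engine import ImageInferenceEngine

# `trained_image_model` is a hypothetical placeholder for a trained image model
# obtained elsewhere; the transform must return a torch.Tensor that the model
# can consume directly.
transform = Compose([Resize(512), CenterCrop(480), ToTensor()])
engine = ImageInferenceEngine(image_model=trained_image_model, transform=transform)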

get_projected_global_embedding(image_path)[source]

Compute global image embedding in the joint latent space.

Parameters

image_path (Path) – Path to the image to compute embeddings for.

Return type

Tensor

Returns

Torch tensor containing the L2-normalised global image embedding of shape [joint_feature_dim,], where joint_feature_dim is the dimensionality of the joint latent space.
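
A minimal usage sketch, assuming engine is the ImageInferenceEngine constructed above and "chest_xray.png" is a hypothetical local image file:

from pathlib import Path

import torch

# Compute the global embedding for a single image.
global_embedding = engine.get_projected_global_embedding(Path("chest_xray.png"))

print(global_embedding.shape)               # torch.Size([joint_feature_dim])
print(torch.linalg.norm(global_embedding))  # ~1.0, since the embedding is L2-normalised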

get_projected_patch_embeddings(image_path)[source]

Compute image patch embeddings in the joint latent space, preserving the image grid.

Parameters

image_path (Path) – Path to the image to compute embeddings for.

Return type

Tuple[Tensor, Tuple[int, int]]

Returns

A tuple containing the image patch embeddings and the shape of the original image (width, height) before applying transforms.
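
A brief sketch of how the returned tuple might be unpacked, again assuming engine from the constructor example and a hypothetical image path:

from pathlib import Path

# The first element keeps the spatial grid of the image (one embedding per patch
# position); the second is the original image size (width, height) before any
# transforms were applied.
patch_embeddings, (width, height) = engine.get_projected_patch_embeddings(
    Path("chest_xray.png")
)

print(patch_embeddings.shape, width, height)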

load_and_transform_input_image(image_path, transform)[source]

Read an image and apply the transform to it.

  1. Read the image from the given path

  2. Apply transform

  3. Add the batch dimension

  4. Move to the correct device

Parameters

  • image_path (Path) – Path to the image to load and transform.

  • transform (Compose) – Transform to apply to the image after loading.

Return type

Tuple[Tensor, Tuple[int, int]]

Returns

A tuple containing the transformed image tensor (with the batch dimension added and moved to the correct device) and the original shape of the image (width, height) before the transforms.
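
A short sketch of the steps listed above, assuming engine and transform from the constructor example and a hypothetical image path:

from pathlib import Path

# Read the image, apply the transform, add the batch dimension and move the
# tensor to the engine's device, returning the original (width, height) as well.
image_tensor, (width, height) = engine.load_and_transform_input_image(
    Path("chest_xray.png"), transform
)

# The returned tensor can be passed directly to the image model.
print(image_tensor.shape, width, height)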