Currently, Active calculates and displays embeddings using a generic CLIP model. This model performs well across a wide range of tasks, but it may struggle with highly specialized ones. The embeddings generated by the generic model are used for natural language search (not supported for custom embeddings), image similarity search, and the embeddings view (where you view reduced embeddings).
We currently support embeddings with 1 to 4096 dimensions for Index and 1 to 2000 dimensions for Active, following on from our in-house CLIP embeddings.
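The dimension limits above can be checked locally before anything is uploaded. The following is a minimal sketch, not part of the Encord SDK: the names `MAX_DIMS` and `is_valid_embedding` are hypothetical, and the limits are simply the ones stated in the text.

```python
import math
from typing import Sequence

# Upper limits taken from the text above; both names are hypothetical helpers,
# not part of the Encord SDK.
MAX_DIMS = {"index": 4096, "active": 2000}


def is_valid_embedding(vector: Sequence[float], target: str = "active") -> bool:
    """Return True if the vector has an allowed length and only finite numbers."""
    limit = MAX_DIMS[target]
    if not 1 <= len(vector) <= limit:
        return False
    # Reject non-numeric values as well as NaN and infinity.
    return all(isinstance(v, (int, float)) and math.isfinite(v) for v in vector)
```

Running a check like this over your vectors before building the metadata updates can catch an over-long or malformed embedding early.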
We support embeddings for images, image sequences and image groups. Support for Videos is coming soon.
A key is required in your custom metadata schema for your embeddings. You can use any string as the key for your embeddings. We strongly recommend that you use a string that is meaningful.
If you do not include a key in your metadata schema, your imported embeddings are treated as strings.
Embedding key names can contain alphanumeric characters (a-z, A-Z, 0-9), hyphens, and underscores.
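As an illustration of the naming rule, a key can be checked with a regular expression before it is added to the schema. This is a sketch under the assumption that the allowed digits are 0-9 and that an empty key is not acceptable; `is_valid_embedding_key` is a hypothetical helper, not an Encord SDK function.

```python
import re

# Letters, digits, hyphens and underscores only; at least one character.
# Assumption: the digit range is 0-9 and empty keys are rejected.
KEY_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_embedding_key(key: str) -> bool:
    """Return True if the key uses only the allowed characters."""
    return KEY_PATTERN.fullmatch(key) is not None
```

For example, `my-test-active-embedding` passes this check, while a key containing a space or a dot does not.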
Use add_embedding to add an embedding to your metadata schema.
    # Import dependencies
    from encord import EncordUserClient
    from encord.metadata_schema import MetadataSchema

    SSH_PATH = "<file-path-to-ssh-private-key>"

    # Authenticate with Encord using the path to your private key
    user_client: EncordUserClient = EncordUserClient.create_with_ssh_private_key(
        ssh_private_key_path=SSH_PATH
    )

    # Create the schema
    metadata_schema = user_client.metadata_schema()

    # Add embedding fields
    metadata_schema.add_embedding('my-test-active-embedding', size=512)
    metadata_schema.add_embedding('my-test-index-embedding', size=<values-from-1-to-4096>)

    # Save the schema
    metadata_schema.save()

    # Print the schema for verification
    print(metadata_schema)
    # Import dependencies
    from encord import EncordUserClient
    from encord.http.bundle import Bundle

    # Authentication
    SSH_PATH = "<file-path-to-ssh-private-key>"

    # Authenticate with Encord using the path to your private key
    user_client: EncordUserClient = EncordUserClient.create_with_ssh_private_key(
        ssh_private_key_path=SSH_PATH,
    )

    # Define a dictionary with item UUIDs and their respective metadata updates
    updates = {
        "<data-ID-1>": {"<my-embedding>": [1.0, 2.0, 3.0]},
        "<data-ID-2>": {"<my-embedding>": [1.0, 2.0, 3.0]},
    }

    # Use the Bundle context manager
    with Bundle() as bundle:
        # Update the storage items based on the dictionary
        for item_uuid, metadata_update in updates.items():
            item = user_client.get_storage_item(item_uuid=item_uuid)

            # Make a copy of the current metadata and update it with the new metadata
            curr_metadata = item.client_metadata.copy()
            curr_metadata.update(metadata_update)

            # Update the item with the new metadata and bundle
            item.update(client_metadata=curr_metadata, bundle=bundle)
    # Import dependencies
    from encord import EncordUserClient
    from encord.http.bundle import Bundle
    from encord.orm.storage import StorageFolder, StorageItem, StorageItemType, FoldersSortBy

    # Authentication
    SSH_PATH = "<file-path-to-ssh-private-key>"

    # Authenticate with Encord using the path to your private key
    user_client: EncordUserClient = EncordUserClient.create_with_ssh_private_key(
        ssh_private_key_path=SSH_PATH,
    )

    updates = {
        "<data-hash-1>": {
            "$encord": {
                "frames": {
                    "<frame-number-1>": {
                        "<my-embedding>": [1.0, 2.0, 3.0],  # custom embedding ("embedding") with float values
                    },
                    "<frame-number-2>": {
                        "<my-embedding>": [1.0, 2.0, 3.0],  # custom embedding ("embedding") with float values
                    },
                }
            }
        },
        "<data-hash-2>": {
            "$encord": {
                "config": {
                    "sampling_rate": <samples-per-second>,  # VIDEO ONLY (optional, default = 1 sample/second)
                    "keyframe_mode": "frame",  # VIDEO ONLY: "frame" or "seconds" (optional, default = "frame")
                },
                "frames": {
                    "<frame-number-1>": {
                        "<my-embedding>": [1.0, 2.0, 3.0],  # custom embedding ("embedding") with float values
                    },
                    "<frame-number-2>": {
                        "<my-embedding>": [1.0, 2.0, 3.0],  # custom embedding ("embedding") with float values
                    },
                },
            }
        },
    }

    # Use the Bundle context manager
    with Bundle() as bundle:
        # Update the storage items based on the dictionary
        for item_uuid, metadata_update in updates.items():
            item = user_client.get_storage_item(item_uuid=item_uuid)

            # Make a copy of the current metadata and update it with the new metadata
            curr_metadata = item.client_metadata.copy()
            curr_metadata.update(metadata_update)

            # Update the item with the new metadata and bundle
            item.update(client_metadata=curr_metadata, bundle=bundle)
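Writing the nested `$encord` / `frames` dictionary by hand becomes error-prone once there are many frames. The sketch below shows one way to build it from a mapping of frame numbers to vectors. `build_frame_update` is a hypothetical helper, not an Encord SDK function, and it assumes that frame numbers are used as string keys, as in the example above.

```python
from typing import Any, Dict, Mapping, Sequence


def build_frame_update(
    key: str, per_frame: Mapping[int, Sequence[float]]
) -> Dict[str, Any]:
    """Build the per-frame metadata update for one data unit.

    Assumption: frame numbers are stored as string keys under "frames".
    """
    frames = {
        str(frame_number): {key: [float(value) for value in vector]}
        for frame_number, vector in per_frame.items()
    }
    return {"$encord": {"frames": frames}}
```

The returned dictionary can be used as the value for a single data hash in `updates`. Any optional `config` entry would still need to be added separately.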
Before you can use your custom embeddings in Encord Active Projects, you need to import the custom embeddings. This is performed while you import your Annotate Project into Active.
For existing Active Projects, you can import custom embeddings if your Project was imported to the Data or Labels stage. If the Project was imported to Metrics & Embeddings, you must delete the Project in Active and re-import it with your custom embeddings.
After updating your embeddings, sync your Active Project to automatically apply the new embeddings.
To import a Project with custom embeddings:
Log in to the Encord platform.
The landing page for the Encord platform appears.