Annotator Training
Follow these steps to set up annotator training in Encord. The annotator training workflow enables you to assess the accuracy and performance of your annotation workforce.
Overview
- Create a Benchmark Project: Establish ground truth labels by having a trusted expert annotate the data. This project must be completed before annotator training begins.
- Set up Annotator Training Projects: You must create one training Project per annotator. Use the same Dataset as the Benchmark Project in each annotator training Project. Annotators label the data, and their work is compared against the gold standard created in the benchmark.
- Annotators Label Training Tasks: Annotators must complete all the training tasks assigned to them.
- Evaluate Annotator Performance: Use the provided SDK script to compare annotator labels with the benchmark. Analyze the results to assess accuracy and provide targeted feedback.
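To make the evaluation step concrete, the sketch below scores one annotator against the benchmark. It is not the Encord SDK script mentioned above. It assumes labels have already been exported into plain dictionaries that map a data unit title to a single class name, and the names agreement_rate, benchmark, and annotator are hypothetical.

```python
def agreement_rate(benchmark: dict[str, str], annotator: dict[str, str]) -> float:
    """Fraction of benchmark items on which the annotator chose the same class.

    Items the annotator did not label count as disagreements.
    """
    if not benchmark:
        raise ValueError("benchmark must contain at least one labelled item")
    matches = sum(1 for key, label in benchmark.items() if annotator.get(key) == label)
    return matches / len(benchmark)


# Hypothetical exported labels: data unit title -> class name.
benchmark = {
    "img-01.jpg": "apple",
    "img-02.jpg": "orange",
    "img-03.jpg": "apple",
    "img-04.jpg": "pear",
}
annotator = {"img-01.jpg": "apple", "img-02.jpg": "apple", "img-03.jpg": "apple"}

print(f"Agreement with benchmark: {agreement_rate(benchmark, annotator):.0%}")
```

Real projects usually need a richer comparison (for example, overlap between shapes rather than exact class matches), so treat this only as an outline of the idea.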
STEP 1: Add Files to Encord
You must first add your data to Encord. These files are used to train your annotators.
Create a Cloud Integration
Create a Folder to Store your Files
- Navigate to Files under the Index heading in the Encord platform.
- Click the + New folder button to create a new folder. A dialog to create a new folder appears.
- Give the folder a meaningful name and description.
- Click Create to create the folder. The folder is listed in Files.
Create JSON file for Registration
To register files from cloud storage into Encord, you must create a JSON file specifying the files you want to upload.
While you can use a CSV file, we strongly recommend using JSON files for uploading cloud data to Encord for better compatibility and performance.
Find helpful scripts for creating JSON files for the data registration process here.
All types of data (videos, images, image groups, image sequences, and DICOM) from a private cloud are added to a Dataset in the same way, by using a JSON or CSV file. The file includes links to all images, image groups, videos and DICOM files in your cloud storage.
Encord enforces the following upload limits for each JSON file used for file registration:
- Up to 1 million URLs
- A maximum of 500,000 items (e.g. images, image groups, videos, DICOMs)
- URLs can be up to 16 KB in size
Optimal upload chunking can vary depending on your data type and the amount of associated metadata. For tailored recommendations, contact Encord support. We recommend starting with smaller uploads and gradually increasing the size based on how quickly jobs are processed. Generally, smaller chunks result in faster data reflection within the platform.
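One way to follow the chunking advice is to split a long list of URLs into several smaller registration payloads. The sketch below does this for single images only, where each item carries exactly one URL, so the item limit is the one that applies. The function name chunk_image_urls and the example.com URLs are hypothetical, and the chunk size is something to tune for your own data.

```python
import json

MAX_ITEMS_PER_FILE = 500_000   # item limit stated above
MAX_URL_BYTES = 16 * 1024      # URL size limit stated above


def chunk_image_urls(urls: list[str], chunk_size: int) -> list[dict]:
    """Split image URLs into registration payloads of at most chunk_size items."""
    if not 0 < chunk_size <= MAX_ITEMS_PER_FILE:
        raise ValueError(f"chunk_size must be between 1 and {MAX_ITEMS_PER_FILE}")
    for url in urls:
        if len(url.encode("utf-8")) > MAX_URL_BYTES:
            raise ValueError(f"URL exceeds 16 KB: {url[:60]}...")
    return [
        {
            "images": [{"objectUrl": url} for url in urls[start:start + chunk_size]],
            "skip_duplicate_urls": True,
        }
        for start in range(0, len(urls), chunk_size)
    ]


# Hypothetical URLs; replace with paths from your own cloud storage.
urls = [f"https://example.com/images/{i}.jpg" for i in range(5)]
payloads = chunk_image_urls(urls, chunk_size=2)
print(f"{len(payloads)} payloads")
print(json.dumps(payloads[0], indent=2))
```

Each payload can then be written to its own JSON file and registered separately, starting with small chunks and increasing the size as jobs complete.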
For detailed information about the JSON file format used for import go here.
The information provided about each of the following data types is designed to get you up and running as quickly as possible without going too deeply into the why or how. Look at the template for each data type, then the examples, and adjust the examples to suit your needs.
If skip_duplicate_urls is set to true, all object URLs that exactly match existing images/videos in the dataset are skipped.
AWS JSON
Videos
Setting sampling_rate to 0 imports only the first frame and any key frames you specify in the video. This can significantly speed up the import of your data into Active and Index and helps you focus on only the data you identify as critical.
The following table provides some guidance for the examples provided after the table.
Title | Description |
---|---|
Template | Provides the proper JSON format to import videos into Encord. This template provides examples from the most basic to the most complex. |
Data | Imports videos into Encord. |
Key Frames | Imports videos with an Encord title and specifies key frames (frames of interest) for Active and Index. Specifying a sampling_rate of 0 imports only the first frame and all key frames of your video into Active and Index. |
Custom Metadata | Imports videos with an Encord title, specifies key frames (frames of interest), and custom metadata for Active and Index. Specifying a sampling_rate of 0 imports only the first frame and all key frames of your video into Active and Index. |
Embeddings | Imports videos with an Encord title, specifies key frames (frames of interest), custom metadata, and custom embeddings for Active and Index. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number. Specifying a sampling_rate of 0 imports only the first frame and all key frames of your video into Active and Index. Refer to our documentation for more information about Index with Custom Metadata, Index with Custom Embeddings, Active with Custom Metadata and Active with Custom Embeddings. |
Video Metadata | Imports videos with the videoMetadata flag. When the videoMetadata flag is present in the JSON file, we directly use the supplied metadata without performing any additional validation, and do not store the file on our servers. To guarantee accurate labels, it is crucial that the metadata you provide is accurate. |
{
"videos": [
{
"objectUrl": "cloud-path-to-your-video-1"
},
{
"objectUrl": "cloud-path-to-your-video-2",
"videoMetadata": {
"fps": frames-per-second,
"duration": duration-in-seconds,
"width": frame-width,
"height": frame-height,
"file_size": file-size-in-bytes,
"mime_type": "MIME-file-type-extension"
}
},
{
"objectUrl": "cloud-path-to-your-video-3",
"title": "title-for-your-video-3",
"clientMetadata": {"metadata-1": "value", "metadata-2": "value"}
},
{
"objectUrl": "cloud-path-to-your-video-4",
"title": "title-for-your-video-4",
"clientMetadata": {
"metadata-1": "value", "metadata-2": "value",
"$encord": {
"frames": ["<frame-number-1>","<frame-number-2>","<frame-number-3>"]
}
}
},
{
"objectUrl": "cloud-path-to-your-video-5",
"title": "title-for-your-video-5",
"clientMetadata": {
"metadata-1": "value", "metadata-2": "value",
"$encord": {
"frames": {
"<frame-number-or-seconds>": {
"<my-embedding>": [1.0, 2.0, 3.0]
},
"<frame-number-or-seconds>": {
"<my-embedding>": [1.0, 2.0, 3.0]
}
}
}
}
},
{
"objectUrl": "cloud-path-to-your-video-6",
"title": "title-for-your-video-6",
"clientMetadata": {
"metadata-1": "value", "metadata-2": "value",
"$encord": {
"config": {
"sampling_rate": "<samples-per-second>",
"keyframe_mode": "<frame-or-seconds>"
},
"frames": {
"<frame-number-or-seconds>": {
"<my-embedding>": [1.0, 2.0, 3.0]
},
"<frame-number-or-seconds>": {
"<my-embedding>": [1.0, 2.0, 3.0]
}
}
}
}
}
],
"skip_duplicate_urls": true
}
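The template can also be generated programmatically. The following sketch builds video entries using only the keys that appear in the template (objectUrl, title, clientMetadata, $encord, config, sampling_rate, keyframe_mode, frames). The helper name video_entry is hypothetical, and the sketch assumes that key frames are given as integer frame numbers and that sampling_rate is numeric; check both against the JSON format reference before relying on them.

```python
import json


def video_entry(object_url, title=None, key_frames=None, sampling_rate=None, metadata=None):
    """Build one item for the "videos" array of a registration file."""
    entry = {"objectUrl": object_url}
    if title is not None:
        entry["title"] = title

    client_metadata = dict(metadata or {})
    encord = {}
    if sampling_rate is not None:  # 0 is a valid value, so test against None
        # Assumption: key frames are addressed by frame number, not seconds.
        encord["config"] = {"sampling_rate": sampling_rate, "keyframe_mode": "frame"}
    if key_frames:
        encord["frames"] = list(key_frames)
    if encord:
        client_metadata["$encord"] = encord
    if client_metadata:
        entry["clientMetadata"] = client_metadata
    return entry


payload = {
    "videos": [
        video_entry("cloud-path-to-your-video-1"),
        video_entry(
            "cloud-path-to-your-video-2",
            title="title-for-your-video-2",
            key_frames=[0, 30, 90],
            sampling_rate=0,
            metadata={"metadata-1": "value"},
        ),
    ],
    "skip_duplicate_urls": True,
}
print(json.dumps(payload, indent=2))
```

Building entries in code avoids the missing or trailing commas that are easy to introduce when editing a large template by hand.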
Audio Files
The following is an example JSON file for uploading two audio files to Encord.
- Template: Imports audio files with an Encord title, and with custom metadata. Custom metadata only appears in the Encord UI in Active and Index as an option to filter your data.
- Audio Metadata: Imports one audio file with the audiometadata flag. When the audiometadata flag is present in the JSON file, we directly use the supplied metadata without performing any additional validation, and do not store the file on our servers. To guarantee accurate labels, it is crucial that the metadata you provide is accurate.
{
"audio": [
{
"objectUrl": "<object url_1>"
},
{
"objectUrl": "<object url_2>",
"title": "my-custom-audio-file-title.mp3",
"clientMetadata": {"optional_key_1": "optional_metadata_value_1"}
}
],
"skip_duplicate_urls": true
}
PDFs
The following is an example JSON file for uploading PDFs to Encord.
- Template: Imports PDFs with an Encord title, and with custom metadata. Custom metadata only appears in the Encord UI in Active and Index as an option to filter your data.
- Data: Imports two PDFs with no title or custom metadata.
- Custom Metadata: Imports two PDFs with a title and custom metadata.
{
"pdfs": [
{
"objectUrl": "<object url_1>"
},
{
"objectUrl": "<object url_2>",
"title": "my-file.pdf",
"clientMetadata": {"optional_key_1": "optional_metadata_value_1"}
}
],
"skip_duplicate_urls": true
}
Text Files
The following is an example JSON file for uploading text files to Encord.
- Template: Imports text files with an Encord title, and with custom metadata. Custom metadata only appears in the Encord UI in Active and Index as an option to filter your data.
- Data: Imports two text files with no title or custom metadata.
- Custom Metadata: Imports two text files with a title and custom metadata.
{
"text": [
{
"objectUrl": "<object url_1>"
},
{
"objectUrl": "<object url_2>",
"title": "my-file.html",
"clientMetadata": {"optional_key_1": "optional_metadata_value_1"}
}
],
"skip_duplicate_urls": true
}
Single Images
For detailed information about the JSON file format used for import go here.
The JSON structure for single images parallels that of videos.
Template: Provides the proper JSON format to import images into Encord.
Examples:
- Data: Imports the images only.
- Custom Metadata: Imports images with an Encord title for the images and with custom metadata for each image. Custom metadata only appears in Active and Index as an option to filter your data. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number.
- Embeddings: Imports images with an Encord title, custom metadata, and custom embeddings for each image. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number.
- Image Metadata: Imports images with image metadata. This improves the import speed for your images.
{
"images": [
{
"objectUrl": "file/path/to/images/file-name-01.file-extension"
},
{
"objectUrl": "file/path/to/images/file-name-02.file-extension"
},
{
"objectUrl": "file/path/to/images/file-name-03.file-extension",
"title": "image-title.file-extension",
"clientMetadata": {
"metadata-1": "value",
"metadata-2": "value"
}
},
{
"objectUrl": "file/path/to/images/file-name-04.file-extension",
"title": "image-title.file-extension",
"clientMetadata": {
"metadata-1": "value",
"metadata-2": "value",
"<my-embedding>": [1.0, 2.0, 3.0]
}
}
],
"skip_duplicate_urls": true
}
Image groups
For detailed information about the JSON file format used for import go here.
- Image groups are collections of images that are processed as one annotation task.
- Images within image groups remain unaltered, meaning that images of different sizes and resolutions can form an image group without the loss of data.
- Image groups do NOT require ‘write’ permissions to your cloud storage.
- Custom metadata is defined per image group, not per image. See our documentation here to learn how to add clientMetadata to images in an image group.
- If skip_duplicate_urls is set to true, all URLs exactly matching existing image groups in the dataset are skipped.
The position of each image in the group is specified in its key (objectUrl_{position_number}).
Template: Provides the proper JSON format to import image groups into Encord.
Examples:
- Data: Imports the image groups only.
- Custom Metadata: Imports image groups with an Encord title for the image groups and with custom metadata for each image. Custom metadata only appears in Active and Index as an option to filter your data. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number.
{
"image_groups": [
{
"title": "<title 1>",
"createVideo": false,
"objectUrl_0": "file/path/to/images/file-name-01.file-extension",
"objectUrl_1": "file/path/to/images/file-name-02.file-extension",
"objectUrl_2": "file/path/to/images/file-name-03.file-extension"
},
{
"title": "<title 2>",
"createVideo": false,
"objectUrl_0": "file/path/to/images/file-name-01.file-extension",
"objectUrl_1": "file/path/to/images/file-name-02.file-extension",
"objectUrl_2": "file/path/to/images/file-name-03.file-extension",
"clientMetadata": {"optional": "metadata"}
}
],
"skip_duplicate_urls": true
}
Image sequences
For detailed information about the JSON file format used for import go here.
- Image sequences are collections of images that are processed as one annotation task and represented as a video.
- Images within image sequences may be altered as images of varying sizes and resolutions are made to match that of the first image in the sequence.
- Creating Image sequences from cloud storage requires ‘write’ permissions, as new files have to be created in order to be read as a video.
- Each object in the image_groups array with the createVideo flag set to true represents a single image sequence.
- Custom client metadata is defined per image sequence, not per image.
- If skip_duplicate_urls is set to true, all URLs exactly matching existing image sequences in the dataset are skipped.
Image sequences require the createVideo flag to be set to true. Both image groups and image sequences use the key image_groups.
The position of each image in the sequence is specified in its key (objectUrl_{position_number}).
Template: Provides the proper JSON format to import image sequences into Encord.
Examples:
- Data: Imports the image sequences only.
- Custom Metadata: Imports image sequences and custom metadata. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number.
{
"image_groups": [
{
"title": "<title 1>",
"createVideo": true,
"objectUrl_0": "<object url>"
},
{
"title": "<title 2>",
"createVideo": true,
"objectUrl_0": "<object url>",
"objectUrl_1": "<object url>",
"objectUrl_2": "<object url>",
"clientMetadata": {"optional": "metadata"}
}
],
"skip_duplicate_urls": true
}
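Because image groups and image sequences differ only in the createVideo flag and both number their images through objectUrl_{position_number} keys, one helper can produce either. The sketch below is illustrative: image_group_entry and its as_sequence parameter are hypothetical names, and the URLs are placeholders.

```python
import json


def image_group_entry(title, urls, as_sequence=False, metadata=None):
    """Build one item for the "image_groups" array.

    as_sequence=True sets createVideo to true (image sequence);
    as_sequence=False sets it to false (image group).
    """
    if not urls:
        raise ValueError("at least one image URL is required")
    entry = {"title": title, "createVideo": as_sequence}
    # The position of each image is encoded in the key itself.
    for position, url in enumerate(urls):
        entry[f"objectUrl_{position}"] = url
    if metadata:
        entry["clientMetadata"] = dict(metadata)
    return entry


urls = [f"file/path/to/images/file-name-0{i}.jpg" for i in (1, 2, 3)]
payload = {
    "image_groups": [
        image_group_entry("<title 1>", urls),                    # image group
        image_group_entry("<title 2>", urls, as_sequence=True,   # image sequence
                          metadata={"optional": "metadata"}),
    ],
    "skip_duplicate_urls": True,
}
print(json.dumps(payload, indent=2))
```

Both entries go into the same image_groups array, which keeps the file free of repeated top-level keys.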
DICOM
For detailed information about the JSON file format used for import go here.
- Each dicom_series element can contain one or more DICOM series.
- Each series requires a title and at least one object URL, as shown in the example below.
- If skip_duplicate_urls is set to true, all object URLs exactly matching existing DICOM files in the dataset are skipped.
Patient metadata is included in the .dcm file and does not have to be specified during the upload to Encord.
The following is an example JSON for uploading three DICOM series belonging to a study. Each title and object URL correspond to individual DICOM series.
- The first series contains only a single object URL, as it is composed of a single file.
- The second series contains 3 object URLs, as it is composed of three separate files.
- The third series contains 2 object URLs, as it is composed of two separate files.
For each DICOM upload, an additional DicomSeries file is created. This file represents the series file-set. Only DicomSeries are displayed in the Encord application.
{
"dicom_series": [
{
"title": "Series-1",
"objectUrl_0": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/study1-series1-file.dcm"
},
{
"title": "Series-2",
"objectUrl_0": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/study1-series2-file1.dcm",
"objectUrl_1": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/study1-series2-file2.dcm",
"objectUrl_2": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/study1-series2-file3.dcm"
},
{
"title": "Series-3",
"objectUrl_0": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/study1-series3-file1.dcm",
"objectUrl_1": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/study1-series3-file2.dcm"
}
],
"skip_duplicate_urls": true
}
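The two requirements on a DICOM series (a title and at least one object URL) can be checked before a file is registered. The sketch below is a local pre-check, not part of Encord. The function name check_dicom_series and the example.com URLs are hypothetical.

```python
import re

OBJECT_URL_KEY = re.compile(r"^objectUrl_\d+$")


def check_dicom_series(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means both rules hold."""
    problems = []
    for index, series in enumerate(payload.get("dicom_series", [])):
        if not series.get("title"):
            problems.append(f"series {index}: missing title")
        if not any(OBJECT_URL_KEY.match(key) for key in series):
            problems.append(f"series {index}: no objectUrl_N key")
    return problems


payload = {
    "dicom_series": [
        {"title": "Series-1", "objectUrl_0": "https://example.com/study1-series1-file.dcm"},
        {"objectUrl_0": "https://example.com/study1-series2-file1.dcm"},  # no title
        {"title": "Series-3"},  # no object URL
    ]
}
for problem in check_dicom_series(payload):
    print(problem)
```

The check only covers the two rules stated above; it does not confirm that the URLs are reachable or that the files are valid DICOM.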
You can upload multiple file types using a single JSON file. The example below shows 1 image, 2 videos, 2 image sequences, and 1 image group.
{
"images": [
{
"objectUrl": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/Image1.png"
}
],
"videos": [
{
"objectUrl": "https://encord-integration.s3.eu-west-2.amazonaws.com/videos/Cooking.mp4"
},
{
"objectUrl": "https://encord-integration.s3.eu-west-2.amazonaws.com/videos/Oranges.mp4"
}
],
"image_groups": [
{
"title": "apple-samsung-light",
"createVideo": true,
"objectUrl_0": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/1+(32).jpg",
"objectUrl_1": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/1+(33).jpg",
"objectUrl_2": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/1+(34).jpg",
"objectUrl_3": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/1+(35).jpg"
},
{
"title": "apple-samsung-dark",
"createVideo": true,
"objectUrl_0": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/2+(32).jpg",
"objectUrl_1": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/2+(33).jpg",
"objectUrl_2": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/2+(34).jpg",
"objectUrl_3": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/2+(35).jpg"
},
{
"title": "apple-ios-light",
"createVideo": false,
"objectUrl_0": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/3+(32).jpg",
"objectUrl_1": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/3+(33).jpg"
}
],
"skip_duplicate_urls": true
}
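Image sequences and image groups share the image_groups key, so a file assembled by hand from several examples can end up declaring that key twice. Python's json module accepts such a file and silently keeps only the last occurrence, which would drop the earlier entries. The sketch below uses the standard object_pairs_hook parameter of json.loads to reject repeated keys instead. The function names are hypothetical.

```python
import json


def reject_duplicate_keys(pairs):
    """object_pairs_hook that raises if any key repeats within one JSON object."""
    keys = [key for key, _ in pairs]
    duplicates = sorted({key for key in keys if keys.count(key) > 1})
    if duplicates:
        raise ValueError(f"duplicate keys in JSON object: {duplicates}")
    return dict(pairs)


def load_strict(text: str) -> dict:
    """Parse JSON text, failing on repeated keys rather than discarding data."""
    return json.loads(text, object_pairs_hook=reject_duplicate_keys)


merged_by_hand = """
{
  "image_groups": [{"title": "sequence", "createVideo": true, "objectUrl_0": "a.jpg"}],
  "image_groups": [{"title": "group", "createVideo": false, "objectUrl_0": "b.jpg"}],
  "skip_duplicate_urls": true
}
"""

try:
    load_strict(merged_by_hand)
except ValueError as error:
    print(error)
```

Placing every image group and image sequence in a single image_groups array avoids the problem.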
GCP JSON
Videos
Setting sampling_rate to 0 imports only the first frame and any key frames you specify in the video. This can significantly speed up the import of your data into Active and Index and helps you focus on only the data you identify as critical.
The following table provides some guidance for the examples provided after the table.
Title | Description |
---|---|
JSON for videos | Provides the proper JSON format to import videos into Encord. This template provides examples from the most basic to the most complex. |
Data | Imports videos into Encord. |
Key Frames | Imports videos with an Encord title and specifies key frames (frames of interest) for Active and Index. Specifying a sampling_rate of 0 imports only the first frame and all key frames of your video into Active and Index. |
Custom Metadata | Imports videos with an Encord title, specifies key frames (frames of interest), and custom metadata for Active and Index. Specifying a sampling_rate of 0 imports only the first frame and all key frames of your video into Active and Index. |
Embeddings | Imports videos with an Encord title, specifies key frames (frames of interest), custom metadata, and custom embeddings for Active and Index. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number. Specifying a sampling_rate of 0 imports only the first frame and all key frames of your video into Active and Index. Refer to our documentation for more information about Index with Custom Metadata, Index with Custom Embeddings, Active with Custom Metadata and Active with Custom Embeddings. |
Video Metadata | Imports videos with the videoMetadata flag. When the videoMetadata flag is present in the JSON file, we directly use the supplied metadata without performing any additional validation, and do not store the file on our servers. To guarantee accurate labels, it is crucial that the metadata you provide is accurate. |
{
"videos": [
{
"objectUrl": "cloud-path-to-your-video-1"
},
{
"objectUrl": "cloud-path-to-your-video-2",
"videoMetadata": {
"fps": frames-per-second,
"duration": duration-in-seconds,
"width": frame-width,
"height": frame-height,
"file_size": file-size-in-bytes,
"mime_type": "MIME-file-type-extension"
}
},
{
"objectUrl": "cloud-path-to-your-video-3",
"title": "title-for-your-video-3",
"clientMetadata": {"metadata-1": "value", "metadata-2": "value"}
},
{
"objectUrl": "cloud-path-to-your-video-4",
"title": "title-for-your-video-4",
"clientMetadata": {
"metadata-1": "value", "metadata-2": "value",
"$encord": {
"frames": ["<frame-number-1>","<frame-number-2>","<frame-number-3>"]
}
}
},
{
"objectUrl": "cloud-path-to-your-video-5",
"title": "title-for-your-video-5",
"clientMetadata": {
"metadata-1": "value", "metadata-2": "value",
"$encord": {
"frames": {
"<frame-number-or-seconds>": {
"<my-embedding>": [1.0, 2.0, 3.0]
},
"<frame-number-or-seconds>": {
"<my-embedding>": [1.0, 2.0, 3.0]
}
}
}
}
},
{
"objectUrl": "cloud-path-to-your-video-6",
"title": "title-for-your-video-6",
"clientMetadata": {
"metadata-1": "value", "metadata-2": "value",
"$encord": {
"config": {
"sampling_rate": "<samples-per-second>",
"keyframe_mode": "<frame-or-seconds>"
},
"frames": {
"<frame-number-or-seconds>": {
"<my-embedding>": [1.0, 2.0, 3.0]
},
"<frame-number-or-seconds>": {
"<my-embedding>": [1.0, 2.0, 3.0]
}
}
}
}
}
],
"skip_duplicate_urls": true
}
Audio Files
The following is an example JSON file for uploading two audio files to Encord.
- Example 1 imports audio files with an Encord title, and with custom metadata. Custom metadata only appears in the Encord UI in Active and Index as an option to filter your data.
- Example 2 imports one audio file with the audiometadata flag. When the audiometadata flag is present in the JSON file, we directly use the supplied metadata without performing any additional validation, and do not store the file on our servers. To guarantee accurate labels, it is crucial that the metadata you provide is accurate.
{
"audio": [
{
"objectUrl": "<object url_1>"
},
{
"objectUrl": "<object url_2>",
"title": "my-custom-audio-file-title.mp3",
"clientMetadata": {"optional_key_1": "optional_metadata_value_1"}
}
],
"skip_duplicate_urls": true
}
PDFs
The following is an example JSON file for uploading PDFs to Encord.
- Template: Imports PDFs with an Encord title, and with custom metadata. Custom metadata only appears in the Encord UI in Active and Index as an option to filter your data.
- Data: Imports two PDFs with no title or custom metadata.
- Custom Metadata: Imports two PDFs with a title and custom metadata.
{
"pdfs": [
{
"objectUrl": "<object url_1>"
},
{
"objectUrl": "<object url_2>",
"title": "my-file.pdf",
"clientMetadata": {"optional_key_1": "optional_metadata_value_1"}
}
],
"skip_duplicate_urls": true
}
Text Files
The following is an example JSON file for uploading text files to Encord.
- Template: Imports text files with an Encord title, and with custom metadata. Custom metadata only appears in the Encord UI in Active and Index as an option to filter your data.
- Data: Imports two text files with no title or custom metadata.
- Custom Metadata: Imports two text files with a title and custom metadata.
{
"text": [
{
"objectUrl": "<object url_1>"
},
{
"objectUrl": "<object url_2>",
"title": "my-file.html",
"clientMetadata": {"optional_key_1": "optional_metadata_value_1"}
}
],
"skip_duplicate_urls": true
}
Single Images
For detailed information about the JSON file format used for import go here.
The JSON structure for single images parallels that of videos.
Template: Provides the proper JSON format to import images into Encord.
Examples:
- Data: Imports the images only.
- Custom Metadata: Imports images with an Encord title for the images and with custom metadata for each image. Custom metadata only appears in Active and Index as an option to filter your data. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number.
- Embeddings: Imports images with an Encord title, custom metadata, and custom embeddings for each image. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number.
- Image Metadata: Imports images with image metadata. This improves the import speed for your images.
{
"images": [
{
"objectUrl": "file/path/to/images/file-name-01.file-extension"
},
{
"objectUrl": "file/path/to/images/file-name-02.file-extension"
},
{
"objectUrl": "file/path/to/images/file-name-03.file-extension",
"title": "image-title.file-extension",
"clientMetadata": {
"metadata-1": "value",
"metadata-2": "value"
}
},
{
"objectUrl": "file/path/to/images/file-name-04.file-extension",
"title": "image-title.file-extension",
"clientMetadata": {
"metadata-1": "value",
"metadata-2": "value",
"<my-embedding>": [1.0, 2.0, 3.0]
}
}
],
"skip_duplicate_urls": true
}
Image groups
For detailed information about the JSON file format used for import go here.
- Image groups are collections of images that are processed as one annotation task.
- Images within image groups remain unaltered, meaning that images of different sizes and resolutions can form an image group without the loss of data.
- Image groups do NOT require ‘write’ permissions to your cloud storage.
- Custom metadata is defined per image group, not per image. See our documentation here to learn how to add clientMetadata to images in an image group.
- If skip_duplicate_urls is set to true, all URLs exactly matching existing image groups in the dataset are skipped.
The position of each image in the group is specified in its key (objectUrl_{position_number}).
Template: Provides the proper JSON format to import image groups into Encord.
Examples:
- Data: Imports the image groups only.
- Custom Metadata: Imports image groups with an Encord title for the image groups and with custom metadata for each image. Custom metadata only appears in Active and Index as an option to filter your data. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number.
{
"image_groups": [
{
"title": "<title 1>",
"createVideo": false,
"objectUrl_0": "file/path/to/images/file-name-01.file-extension",
"objectUrl_1": "file/path/to/images/file-name-02.file-extension",
"objectUrl_2": "file/path/to/images/file-name-03.file-extension"
},
{
"title": "<title 2>",
"createVideo": false,
"objectUrl_0": "file/path/to/images/file-name-01.file-extension",
"objectUrl_1": "file/path/to/images/file-name-02.file-extension",
"objectUrl_2": "file/path/to/images/file-name-03.file-extension",
"clientMetadata": {"optional": "metadata"}
}
],
"skip_duplicate_urls": true
}
Image sequences
For detailed information about the JSON file format used for import go here.
- Image sequences are collections of images that are processed as one annotation task and represented as a video.
- Images within image sequences may be altered as images of varying sizes and resolutions are made to match that of the first image in the sequence.
- Creating Image sequences from cloud storage requires ‘write’ permissions, as new files have to be created in order to be read as a video.
- Each object in the image_groups array with the createVideo flag set to true represents a single image sequence.
- Custom client metadata is defined per image sequence, not per image.
- If skip_duplicate_urls is set to true, all URLs exactly matching existing image sequences in the dataset are skipped.
Image sequences require the createVideo flag to be set to true. Both image groups and image sequences use the key image_groups.
The position of each image in the sequence is specified in its key (objectUrl_{position_number}).
Template: Provides the proper JSON format to import image sequences into Encord.
Examples:
- Data: Imports the image sequences only.
- Custom Metadata: Imports image sequences and custom metadata. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number.
{
"image_groups": [
{
"title": "<title 1>",
"createVideo": true,
"objectUrl_0": "<object url>"
},
{
"title": "<title 2>",
"createVideo": true,
"objectUrl_0": "<object url>",
"objectUrl_1": "<object url>",
"objectUrl_2": "<object url>",
"clientMetadata": {"optional": "metadata"}
}
],
"skip_duplicate_urls": true
}
DICOM
For detailed information about the JSON file format used for import go here.
- Each dicom_series element can contain one or more DICOM series.
- Each series requires a title and at least one object URL, as shown in the example below.
- If skip_duplicate_urls is set to true, all object URLs exactly matching existing DICOM files in the dataset are skipped.
Patient metadata is included in the .dcm file and does not have to be specified during the upload to Encord.
The following is an example JSON for uploading three DICOM series belonging to a study. Each title and object URL correspond to individual DICOM series.
- The first series contains only a single object URL, as it is composed of a single file.
- The second series contains 3 object URLs, as it is composed of three separate files.
- The third series contains 2 object URLs, as it is composed of two separate files.
For each DICOM upload, an additional DicomSeries file is created. This file represents the series file-set. Only DicomSeries are displayed in the Encord application.
{
"dicom_series": [
{
"title": "Series-1",
"objectUrl_0": "https://storage.cloud.google.com/encord-image-bucket/images/study1-series1-file.dcm"
},
{
"title": "Series-2",
"objectUrl_0": "https://storage.cloud.google.com/encord-image-bucket/images/study1-series2-file1.dcm",
"objectUrl_1": "https://storage.cloud.google.com/encord-image-bucket/images/study1-series2-file2.dcm",
"objectUrl_2": "https://storage.cloud.google.com/encord-image-bucket/images/study1-series2-file3.dcm"
},
{
"title": "Series-3",
"objectUrl_0": "https://storage.cloud.google.com/encord-image-bucket/images/study1-series3-file1.dcm",
"objectUrl_1": "https://storage.cloud.google.com/encord-image-bucket/images/study1-series3-file2.dcm"
}
],
"skip_duplicate_urls": true
}
You can upload multiple file types using a single JSON file. The example below shows 1 image, 2 videos, 2 image sequences, and 1 image group.
{
"images": [
{
"objectUrl": "https://storage.cloud.google.com/encord-image-bucket/images/Image1.png"
}
],
"videos": [
{
"objectUrl": "https://storage.cloud.google.com/encord-image-bucket/videos/Cooking.mp4"
},
{
"objectUrl": "https://storage.cloud.google.com/encord-image-bucket/videos/Oranges.mp4"
}
],
"image_groups": [
{
"title": "apple-samsung-light",
"createVideo": true,
"objectUrl_0": "https://storage.cloud.google.com/encord-image-bucket/images/1+(32).jpg",
"objectUrl_1": "https://storage.cloud.google.com/encord-image-bucket/images/1+(33).jpg",
"objectUrl_2": "https://storage.cloud.google.com/encord-image-bucket/images/1+(34).jpg",
"objectUrl_3": "https://storage.cloud.google.com/encord-image-bucket/images/1+(35).jpg"
},
{
"title": "apple-samsung-dark",
"createVideo": true,
"objectUrl_0": "https://storage.cloud.google.com/encord-image-bucket/images/2+(32).jpg",
"objectUrl_1": "https://storage.cloud.google.com/encord-image-bucket/images/2+(33).jpg",
"objectUrl_2": "https://storage.cloud.google.com/encord-image-bucket/images/2+(34).jpg",
"objectUrl_3": "https://storage.cloud.google.com/encord-image-bucket/images/2+(35).jpg"
},
{
"title": "apple-ios-light",
"createVideo": false,
"objectUrl_0": "https://storage.cloud.google.com/encord-image-bucket/images/3+(32).jpg",
"objectUrl_1": "https://storage.cloud.google.com/encord-image-bucket/images/3+(33).jpg"
}
],
"skip_duplicate_urls": true
}
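When a registration file is adapted from an example written for another cloud provider, URLs pointing at the wrong storage host can be left behind. The sketch below walks a parsed registration payload and lists every object URL whose host differs from the one you expect. The function names and the expected_host parameter are hypothetical, and the check assumes all files in one JSON file live on the same host.

```python
from urllib.parse import urlparse


def iter_object_urls(node):
    """Yield every string stored under a key that starts with "objectUrl"."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key.startswith("objectUrl") and isinstance(value, str):
                yield value
            else:
                yield from iter_object_urls(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_object_urls(item)


def urls_outside_host(payload: dict, expected_host: str) -> list[str]:
    """Return object URLs whose host is not expected_host."""
    return [url for url in iter_object_urls(payload) if urlparse(url).netloc != expected_host]


payload = {
    "image_groups": [
        {
            "title": "apple-samsung-light",
            "createVideo": True,
            "objectUrl_0": "https://storage.cloud.google.com/encord-image-bucket/images/1+(32).jpg",
            "objectUrl_1": "https://encord-integration.s3.eu-west-2.amazonaws.com/images/1+(33).jpg",
        }
    ],
    "skip_duplicate_urls": True,
}
for url in urls_outside_host(payload, "storage.cloud.google.com"):
    print("unexpected host:", url)
```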
Azure JSON
Videos
Setting sampling_rate to 0 imports only the first frame and any key frames you specify in the video. This can significantly speed up the import of your data into Active and Index and helps you focus on only the data you identify as critical.
The following table provides some guidance for the examples provided after the table.
Title | Description |
---|---|
Template | Provides the proper JSON format to import videos into Encord. This template provides examples from the most basic to the most complex. |
Data | Imports videos into Encord. |
Key Frames | Imports videos with an Encord title and specifies key frames (frames of interest) for Active and Index. Specifying a sampling_rate of 0 imports only the first frame and all key frames of your video into Active and Index. |
Custom Metadata | Imports videos with an Encord title, specifies key frames (frames of interest), and custom metadata for Active and Index. Specifying a sampling_rate of 0 imports only the first frame and all key frames of your video into Active and Index. |
Embeddings | Imports videos with an Encord title, specifies key frames (frames of interest), custom metadata, and custom embeddings for Active and Index. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number. Specifying a sampling_rate of 0 imports only the first frame and all key frames of your video into Active and Index. Refer to our documentation for more information about Index with Custom Metadata, Index with Custom Embeddings, Active with Custom Metadata and Active with Custom Embeddings. |
Video Metadata | Imports videos with the videoMetadata flag. When the videoMetadata flag is present in the JSON file, we directly use the supplied metadata without performing any additional validation, and do not store the file on our servers. To guarantee accurate labels, it is crucial that the metadata you provide is accurate. |
{
"videos": [
{
"objectUrl": "cloud-path-to-your-video-1"
},
{
"objectUrl": "cloud-path-to-your-video-2",
"videoMetadata": {
"fps": frames-per-second,
"duration": duration-in-seconds,
"width": frame-width,
"height": frame-height,
"file_size": file-size-in-bytes,
"mime_type": "MIME-file-type-extension"
}
},
{
"objectUrl": "cloud-path-to-your-video-3",
"title": "title-for-your-video-3",
"clientMetadata": {"metadata-1": "value", "metadata-2": "value"}
},
{
"objectUrl": "cloud-path-to-your-video-4",
"title": "title-for-your-video-4",
"clientMetadata": {
"metadata-1": "value", "metadata-2": "value",
"$encord": {
"frames": ["<frame-number-1>","<frame-number-2>","<frame-number-3>"]
}
}
},
{
"objectUrl": "cloud-path-to-your-video-5",
"title": "title-for-your-video-5",
"clientMetadata": {
"metadata-1": "value", "metadata-2": "value",
"$encord": {
"frames": {
"<frame-number-or-seconds>": {
"<my-embedding>": [1.0, 2.0, 3.0]
},
"<frame-number-or-seconds>": {
"<my-embedding>": [1.0, 2.0, 3.0]
}
}
}
}
},
{
"objectUrl": "cloud-path-to-your-video-6",
"title": "title-for-your-video-6",
"clientMetadata": {
"metadata-1": "value", "metadata-2": "value",
"$encord": {
"config": {
"sampling_rate": "<samples-per-second>",
"keyframe_mode": "<frame or seconds>"
},
"frames": {
"<frame-number-or-seconds>": {
"<my-embedding>": [1.0, 2.0, 3.0]
},
"<frame-number-or-seconds>": {
"<my-embedding>": [1.0, 2.0, 3.0]
}
}
}
}
}
],
"skip_duplicate_urls": true
}
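The template above can also be generated programmatically rather than written by hand. The following is a minimal sketch, not an Encord API: the helper names video_entry and build_video_spec are hypothetical, and it assumes that key frames given as a list can be combined with a sampling_rate under the same $encord key, as the template entries suggest. Check the value types (numbers or strings) against the import documentation before use.

```python
import json


def video_entry(object_url, title=None, metadata=None, key_frames=None, sampling_rate=None):
    """Build one entry for the "videos" array (hypothetical helper)."""
    entry = {"objectUrl": object_url}
    if title is not None:
        entry["title"] = title

    client_metadata = dict(metadata or {})
    encord = {}
    if sampling_rate is not None:
        # Assumption: a sampling_rate of 0 keeps only the first frame and the key frames.
        encord["config"] = {"sampling_rate": sampling_rate}
    if key_frames:
        encord["frames"] = list(key_frames)
    if encord:
        client_metadata["$encord"] = encord
    if client_metadata:
        entry["clientMetadata"] = client_metadata
    return entry


def build_video_spec(entries, skip_duplicate_urls=True):
    """Wrap video entries in the top-level structure used by the template."""
    return {"videos": list(entries), "skip_duplicate_urls": skip_duplicate_urls}


spec = build_video_spec([
    video_entry("cloud-path-to-your-video-1"),
    video_entry(
        "cloud-path-to-your-video-4",
        title="title-for-your-video-4",
        metadata={"metadata-1": "value"},
        key_frames=[10, 50, 120],
        sampling_rate=0,
    ),
])

# json.dumps guarantees syntactically valid JSON (commas, quoting, brackets).
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

Generating the file this way avoids the missing or trailing commas that are easy to introduce when editing a long template by hand.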
Audio Files
The following is an example JSON file for uploading two audio files to Encord.
- Template: Imports audio files with an Encord title, and with custom metadata. Custom metadata only appears in the Encord UI in Active and Index as an option to filter your data.
- Audio Metadata: Imports one audio file with the audiometadata flag. When the audiometadata flag is present in the JSON file, we directly use the supplied metadata without performing any additional validation, and do not store the file on our servers. To guarantee accurate labels, it is crucial that the metadata you provide is accurate.
{
"audio": [
{
"objectUrl": "<object url_1>"
},
{
"objectUrl": "<object url_2>",
"title": "my-custom-audio-file-title.mp3",
"clientMetadata": {"optional_key_1": "optional_metadata_value_1"}
}
],
"skip_duplicate_urls": true
}
PDFs
The following is an example JSON file for uploading PDFs to Encord.
- Template: Imports PDFs with an Encord title, and with custom metadata. Custom metadata only appears in the Encord UI in Active and Index as an option to filter your data.
- Data: Imports two PDFs with no title or custom metadata.
- Custom Metadata: Imports two PDFs with a title and custom metadata.
{
"pdfs": [
{
"objectUrl": "<object url_1>"
},
{
"objectUrl": "<object url_2>",
"title": "my-file.pdf",
"clientMetadata": {"optional_key_1": "optional_metadata_value_1"}
}
],
"skip_duplicate_urls": true
}
Text Files
The following is an example JSON file for uploading text files to Encord.
- Template: Imports text files with an Encord title, and with custom metadata. Custom metadata only appears in the Encord UI in Active and Index as an option to filter your data.
- Data: Imports two text files with no title or custom metadata.
- Custom Metadata: Imports two text files with a title and custom metadata.
{
"text": [
{
"objectUrl": "<object url_1>"
},
{
"objectUrl": "<object url_2>",
"title": "my-file.html",
"clientMetadata": {"optional_key_1": "optional_metadata_value_1"}
}
],
"skip_duplicate_urls": true
}
Single Images
For detailed information about the JSON file format used for import go here.
The JSON structure for single images parallels that of videos.
Template: Provides the proper JSON format to import images into Encord.
Examples:
- Data: Imports the images only.
- Custom Metadata: Imports images with an Encord title for the images and with custom metadata for each image. Custom metadata only appears in Active and Index as an option to filter your data. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number.
- Embeddings: Imports images with an Encord title, custom metadata, and custom embeddings for each image. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number.
- Image Metadata: Imports images with image metadata. This improves the import speed for your images.
{
"images": [
{
"objectUrl": "file/path/to/images/file-name-01.file-extension"
},
{
"objectUrl": "file/path/to/images/file-name-02.file-extension"
},
{
"objectUrl": "file/path/to/images/file-name-03.file-extension",
"title": "image-title.file-extension",
"clientMetadata": {
"metadata-1": "value",
"metadata-2": "value"
}
},
{
"objectUrl": "file/path/to/images/file-name-04.file-extension",
"title": "image-title.file-extension",
"clientMetadata": {
"metadata-1": "value",
"metadata-2": "value",
"<my-embedding>": [1.0, 2.0, 3.0]
}
}
],
"skip_duplicate_urls": true
}
Image groups
For detailed information about the JSON file format used for import go here.
- Image groups are collections of images that are processed as one annotation task.
- Images within image groups remain unaltered, meaning that images of different sizes and resolutions can form an image group without the loss of data.
- Image groups do NOT require ‘write’ permissions to your cloud storage.
- Custom metadata is defined per image group, not per image. See our documentation here to learn how to add clientMetadata to images in an image group.
- If skip_duplicate_urls is set to true, all URLs exactly matching existing image groups in the dataset are skipped.
The position of each image within the group is specified in its key (objectUrl_{position_number}).
Template: Provides the proper JSON format to import image groups into Encord.
Examples:
- Data: Imports the image groups only.
- Custom Metadata: Imports image groups with an Encord title for the image groups and with custom metadata for each image. Custom metadata only appears in Active and Index as an option to filter your data. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number.
{
"image_groups": [
{
"title": "<title 1>",
"createVideo": false,
"objectUrl_0": "file/path/to/images/file-name-01.file-extension",
"objectUrl_1": "file/path/to/images/file-name-02.file-extension",
"objectUrl_2": "file/path/to/images/file-name-03.file-extension"
},
{
"title": "<title 2>",
"createVideo": false,
"objectUrl_0": "file/path/to/images/file-name-01.file-extension",
"objectUrl_1": "file/path/to/images/file-name-02.file-extension",
"objectUrl_2": "file/path/to/images/file-name-03.file-extension",
"clientMetadata": {"optional": "metadata"}
}
],
"skip_duplicate_urls": true
}
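The objectUrl_{position_number} keys in the example above can be derived from an ordered list of image URLs. The following is a minimal sketch under that assumption; image_group_entry is a hypothetical helper, not part of the Encord SDK, and the file paths are placeholders.

```python
import json


def image_group_entry(title, image_urls, create_video=False, metadata=None):
    """Build one "image_groups" entry from an ordered list of image URLs (hypothetical helper)."""
    if not image_urls:
        raise ValueError("an image group needs at least one image URL")
    entry = {"title": title, "createVideo": create_video}
    # The list index becomes the position number in the key: objectUrl_0, objectUrl_1, ...
    for position, url in enumerate(image_urls):
        entry[f"objectUrl_{position}"] = url
    if metadata:
        entry["clientMetadata"] = dict(metadata)
    return entry


group = image_group_entry(
    "<title 1>",
    [
        "file/path/to/images/file-name-01.file-extension",
        "file/path/to/images/file-name-02.file-extension",
        "file/path/to/images/file-name-03.file-extension",
    ],
)
spec = {"image_groups": [group], "skip_duplicate_urls": True}
print(json.dumps(spec, indent=2))
```

Passing create_video=True to the same helper would produce an entry in the image sequence form, since both forms share the image_groups key.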
Image sequences
For detailed information about the JSON file format used for import go here.
- Image sequences are collections of images that are processed as one annotation task and represented as a video.
- Images within image sequences may be altered as images of varying sizes and resolutions are made to match that of the first image in the sequence.
- Creating Image sequences from cloud storage requires ‘write’ permissions, as new files have to be created in order to be read as a video.
- Each object in the image_groups array with the createVideo flag set to true represents a single image sequence.
- Custom client metadata is defined per image sequence, not per image.
- If skip_duplicate_urls is set to true, all URLs exactly matching existing image sequences in the dataset are skipped.
Image sequences require the createVideo flag to be set to true. Image groups and image sequences both use the key image_groups.
The position of each image within the sequence is specified in its key (objectUrl_{position_number}).
Template: Provides the proper JSON format to import image sequences into Encord.
Examples:
- Data: Imports the image sequences only.
- Custom Metadata: Imports image sequences and custom metadata. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number.
{
"image_groups": [
{
"title": "<title 1>",
"createVideo": true,
"objectUrl_0": "<object url>"
},
{
"title": "<title 2>",
"createVideo": true,
"objectUrl_0": "<object url>",
"objectUrl_1": "<object url>",
"objectUrl_2": "<object url>",
"clientMetadata": {"optional": "metadata"}
}
],
"skip_duplicate_urls": true
}
DICOM
For detailed information about the JSON file format used for import go here.
- Each dicom_series element can contain one or more DICOM series.
- Each series requires a title and at least one object URL, as shown in the example below.
- If skip_duplicate_urls is set to true, all object URLs exactly matching existing DICOM files in the dataset will be skipped.
.dcm file and does not have to be specific during the upload to Encord.
The following is an example JSON for uploading three DICOM series belonging to a study. Each title and object URL correspond to individual DICOM series.
- The first series contains only a single object URL, as it is composed of a single file.
- The second series contains 3 object URLs, as it is composed of three separate files.
- The third series contains 2 object URLs, as it is composed of two separate files.
For each DICOM upload, an additional DicomSeries file is created. This file represents the series file-set. Only DicomSeries are displayed in the Encord application.
{
"dicom_series": [
{
"title": "Series-1",
"objectUrl_0": "https://myaccount.blob.core.windows.net/encordcontainer/study1-series1-file.dcm"
},
{
"title": "Series-2",
"objectUrl_0": "https://myaccount.blob.core.windows.net/encordcontainer/study1-series2-file1.dcm",
"objectUrl_1": "https://myaccount.blob.core.windows.net/encordcontainer/study1-series2-file2.dcm",
"objectUrl_2": "https://myaccount.blob.core.windows.net/encordcontainer/study1-series2-file3.dcm"
},
{
"title": "Series-3",
"objectUrl_0": "https://myaccount.blob.core.windows.net/encordcontainer/study1-series3-file1.dcm",
"objectUrl_1": "https://myaccount.blob.core.windows.net/encordcontainer/study1-series3-file2.dcm"
}
],
"skip_duplicate_urls": true
}
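Before registering a DICOM file, it can help to check the two requirements stated above: every series has a title and at least one object URL. The following is a minimal local sketch of such a check; the function names are hypothetical and this is not how Encord validates the file on its side.

```python
import re

# Keys such as objectUrl_0, objectUrl_1, ...
URL_KEY = re.compile(r"^objectUrl_\d+$")


def dicom_series_problems(series):
    """Return a list of problems found in one dicom_series entry (hypothetical helper)."""
    problems = []
    if not series.get("title"):
        problems.append("missing title")
    if not any(URL_KEY.match(key) for key in series):
        problems.append("no objectUrl_<n> key")
    return problems


def check_spec(spec):
    """Map the index of each invalid series to its list of problems."""
    report = {}
    for index, series in enumerate(spec.get("dicom_series", [])):
        problems = dicom_series_problems(series)
        if problems:
            report[index] = problems
    return report


spec = {
    "dicom_series": [
        {"title": "Series-1", "objectUrl_0": "<object url>"},
        {"title": "Series-2"},  # deliberately invalid: no object URL
    ],
    "skip_duplicate_urls": True,
}
report = check_spec(spec)
print(report)
```

An empty report means every series in the file satisfies both requirements.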
You can upload multiple file types using a single JSON file. The example below shows 1 image, 2 videos, 2 image sequences, and 1 image group.
{
"images": [
{
"objectUrl": "https://myaccount.blob.core.windows.net/encordcontainer/Image1.png"
}
],
"videos": [
{
"objectUrl": "https://myaccount.blob.core.windows.net/encordcontainer/Cooking.mp4"
},
{
"objectUrl": "https://myaccount.blob.core.windows.net/encordcontainer/Oranges.mp4"
}
],
"image_groups": [
{
"title": "apple-samsung-light",
"createVideo": true,
"objectUrl_0": "https://myaccount.blob.core.windows.net/encordcontainer/1-Samsung-S4-Light+Environment/1+(32).jpg",
"objectUrl_1": "https://myaccount.blob.core.windows.net/encordcontainer/1-Samsung-S4-Light+Environment/1+(33).jpg",
"objectUrl_2": "https://myaccount.blob.core.windows.net/encordcontainer/1-Samsung-S4-Light+Environment/1+(34).jpg",
"objectUrl_3": "https://myaccount.blob.core.windows.net/encordcontainer/1-Samsung-S4-Light+Environment/1+(35).jpg"
},
{
"title": "apple-samsung-dark",
"createVideo": true,
"objectUrl_0": "https://myaccount.blob.core.windows.net/encordcontainer/2-samsung-S4-Dark+Environment/2+(32).jpg",
"objectUrl_1": "https://myaccount.blob.core.windows.net/encordcontainer/2-samsung-S4-Dark+Environment/2+(33).jpg",
"objectUrl_2": "https://myaccount.blob.core.windows.net/encordcontainer/2-samsung-S4-Dark+Environment/2+(34).jpg",
"objectUrl_3": "https://myaccount.blob.core.windows.net/encordcontainer/2-samsung-S4-Dark+Environment/2+(35).jpg"
},
{
"title": "apple-ios-light",
"createVideo": false,
"objectUrl_0": "https://myaccount.blob.core.windows.net/encordcontainer/3-IOS-4-Light+Environment/3+(32).jpg",
"objectUrl_1": "https://myaccount.blob.core.windows.net/encordcontainer/3-IOS-4-Light+Environment/3+(33).jpg"
}
],
"skip_duplicate_urls": true
}
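Because image groups and image sequences share the image_groups key, it is easy to write that key twice when combining file types in one file. Many JSON parsers, including Python's json module, silently keep only the last value of a repeated key, so the earlier entries would be lost. The following sketch shows one way to detect this locally before registration; load_strict is a hypothetical helper name.

```python
import json


def reject_duplicate_keys(pairs):
    """object_pairs_hook that raises when a JSON object repeats a key."""
    keys = [key for key, _ in pairs]
    duplicates = sorted({key for key in keys if keys.count(key) > 1})
    if duplicates:
        raise ValueError(f"duplicate JSON keys: {duplicates}")
    return dict(pairs)


def load_strict(text):
    """Parse JSON text, failing on repeated keys instead of dropping data."""
    return json.loads(text, object_pairs_hook=reject_duplicate_keys)


sample = '{"image_groups": [{"title": "a"}], "image_groups": [{"title": "b"}]}'

# The default parser keeps only the last "image_groups" array.
lenient = json.loads(sample)

try:
    load_strict(sample)
    duplicate_detected = False
except ValueError as error:
    duplicate_detected = True
    print(error)
```

Placing all image groups and image sequences in a single image_groups array avoids the problem entirely.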
OTC JSON
Videos
Specifying a sampling_rate of 0 imports only the first frame and any key frames you specify in the video. This can significantly speed up the import of your data into Active and Index and help you to focus on only data you identify as critical.
The following table provides some guidance for the examples provided after the table.
Title | Description |
---|---|
Template | Provides the proper JSON format to import videos into Encord. This template provides examples from the most basic to the most complex. |
Data | Imports videos into Encord. |
Key Frames | Imports videos with an Encord title and specifies key frames (frames of interest) for Active and Index. Specifying a sampling_rate of 0 only imports the first frame and all key frames of your video into Active and Index. |
Custom Metadata | Imports videos with an Encord title, specifies key frames (frames of interest), and custom metadata for Active and Index. Specifying a sampling_rate of 0 only imports the first frame and all key frames of your video into Active and Index. |
Embeddings | Imports videos with an Encord title, specifies key frames (frames of interest), custom metadata, and custom embeddings for Active and Index. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number. Specifying a sampling_rate of 0 only imports the first frame and all key frames of your video into Active and Index. Refer to our documentation for more information about Index with Custom Metadata, Index with Custom Embeddings, Active with Custom Metadata and Active with Custom Embeddings. |
Video Metadata | Imports videos with the videoMetadata flag. When the videoMetadata flag is present in the JSON file, we directly use the supplied metadata without performing any additional validation, and do not store the file on our servers. To guarantee accurate labels, it is crucial that the metadata you provide is accurate. |
{
"videos": [
{
"objectUrl": "cloud-path-to-your-video-1"
},
{
"objectUrl": "cloud-path-to-your-video-2",
"videoMetadata": {
"fps": frames-per-second,
"duration": duration-in-seconds,
"width": frame-width,
"height": frame-height,
"file_size": file-size-in-bytes,
"mime_type": "MIME-file-type-extension"
}
},
{
"objectUrl": "cloud-path-to-your-video-3",
"title": "title-for-your-video-3",
"clientMetadata": {"metadata-1": "value", "metadata-2": "value"}
},
{
"objectUrl": "cloud-path-to-your-video-4",
"title": "title-for-your-video-4",
"clientMetadata": {
"metadata-1": "value", "metadata-2": "value",
"$encord": {
"frames": ["<frame-number-1>","<frame-number-2>","<frame-number-3>"]
}
}
},
{
"objectUrl": "cloud-path-to-your-video-5",
"title": "title-for-your-video-5",
"clientMetadata": {
"metadata-1": "value", "metadata-2": "value",
"$encord": {
"frames": {
"<frame-number-or-seconds>": {
"<my-embedding>": [1.0, 2.0, 3.0]
},
"<frame-number-or-seconds>": {
"<my-embedding>": [1.0, 2.0, 3.0]
}
}
}
}
},
{
"objectUrl": "cloud-path-to-your-video-6",
"title": "title-for-your-video-6",
"clientMetadata": {
"metadata-1": "value", "metadata-2": "value",
"$encord": {
"config": {
"sampling_rate": "<samples-per-second>",
"keyframe_mode": "<frame or seconds>"
},
"frames": {
"<frame-number-or-seconds>": {
"<my-embedding>": [1.0, 2.0, 3.0]
},
"<frame-number-or-seconds>": {
"<my-embedding>": [1.0, 2.0, 3.0]
}
}
}
}
}
],
"skip_duplicate_urls": true
}
Audio Files
The following is an example JSON file for uploading two audio files to Encord.
- Template: Imports audio files with an Encord title, and with custom metadata. Custom metadata only appears in the Encord UI in Active and Index as an option to filter your data.
- Audio Metadata: Imports one audio file with the audiometadata flag. When the audiometadata flag is present in the JSON file, we directly use the supplied metadata without performing any additional validation, and do not store the file on our servers. To guarantee accurate labels, it is crucial that the metadata you provide is accurate.
{
"audio": [
{
"objectUrl": "<object url_1>"
},
{
"objectUrl": "<object url_2>",
"title": "my-custom-audio-file-title.mp3",
"clientMetadata": {"optional_key_1": "optional_metadata_value_1"}
}
],
"skip_duplicate_urls": true
}
PDFs
The following is an example JSON file for uploading PDFs to Encord.
- Template: Imports PDFs with an Encord title, and with custom metadata. Custom metadata only appears in the Encord UI in Active and Index as an option to filter your data.
- Data: Imports two PDFs with no title or custom metadata.
- Custom Metadata: Imports two PDFs with a title and custom metadata.
{
"pdfs": [
{
"objectUrl": "<object url_1>"
},
{
"objectUrl": "<object url_2>",
"title": "my-file.pdf",
"clientMetadata": {"optional_key_1": "optional_metadata_value_1"}
}
],
"skip_duplicate_urls": true
}
Text Files
The following is an example JSON file for uploading text files to Encord.
- Template: Imports text files with an Encord title, and with custom metadata. Custom metadata only appears in the Encord UI in Active and Index as an option to filter your data.
- Data: Imports two text files with no title or custom metadata.
- Custom Metadata: Imports two text files with a title and custom metadata.
{
"text": [
{
"objectUrl": "<object url_1>"
},
{
"objectUrl": "<object url_2>",
"title": "my-file.html",
"clientMetadata": {"optional_key_1": "optional_metadata_value_1"}
}
],
"skip_duplicate_urls": true
}
Single Images
For detailed information about the JSON file format used for import go here.
The JSON structure for single images parallels that of videos.
Template: Provides the proper JSON format to import images into Encord.
Examples:
- Data: Imports the images only.
- Custom Metadata: Imports images with an Encord title for the images and with custom metadata for each image. Custom metadata only appears in Active and Index as an option to filter your data. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number.
- Embeddings: Imports images with an Encord title, custom metadata, and custom embeddings for each image. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number.
- Image Metadata: Imports images with image metadata. This improves the import speed for your images.
{
"images": [
{
"objectUrl": "file/path/to/images/file-name-01.file-extension"
},
{
"objectUrl": "file/path/to/images/file-name-02.file-extension"
},
{
"objectUrl": "file/path/to/images/file-name-03.file-extension",
"title": "image-title.file-extension",
"clientMetadata": {
"metadata-1": "value",
"metadata-2": "value"
}
},
{
"objectUrl": "file/path/to/images/file-name-04.file-extension",
"title": "image-title.file-extension",
"clientMetadata": {
"metadata-1": "value",
"metadata-2": "value",
"<my-embedding>": [1.0, 2.0, 3.0]
}
}
],
"skip_duplicate_urls": true
}
Image groups
For detailed information about the JSON file format used for import go here.
- Image groups are collections of images that are processed as one annotation task.
- Images within image groups remain unaltered, meaning that images of different sizes and resolutions can form an image group without the loss of data.
- Image groups do NOT require ‘write’ permissions to your cloud storage.
- Custom metadata is defined per image group, not per image. See our documentation here to learn how to add clientMetadata to images in an image group.
- If skip_duplicate_urls is set to true, all URLs exactly matching existing image groups in the dataset are skipped.
The position of each image within the group is specified in its key (objectUrl_{position_number}).
Template: Provides the proper JSON format to import image groups into Encord.
Examples:
- Data: Imports the image groups only.
- Custom Metadata: Imports image groups with an Encord title for the image groups and with custom metadata for each image. Custom metadata only appears in Active and Index as an option to filter your data. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number.
{
"image_groups": [
{
"title": "<title 1>",
"createVideo": false,
"objectUrl_0": "file/path/to/images/file-name-01.file-extension",
"objectUrl_1": "file/path/to/images/file-name-02.file-extension",
"objectUrl_2": "file/path/to/images/file-name-03.file-extension"
},
{
"title": "<title 2>",
"createVideo": false,
"objectUrl_0": "file/path/to/images/file-name-01.file-extension",
"objectUrl_1": "file/path/to/images/file-name-02.file-extension",
"objectUrl_2": "file/path/to/images/file-name-03.file-extension",
"clientMetadata": {"optional": "metadata"}
}
],
"skip_duplicate_urls": true
}
Image sequences
For detailed information about the JSON file format used for import go here.
- Image sequences are collections of images that are processed as one annotation task and represented as a video.
- Images within image sequences may be altered as images of varying sizes and resolutions are made to match that of the first image in the sequence.
- Creating Image sequences from cloud storage requires ‘write’ permissions, as new files have to be created in order to be read as a video.
- Each object in the image_groups array with the createVideo flag set to true represents a single image sequence.
- Custom client metadata is defined per image sequence, not per image.
- If skip_duplicate_urls is set to true, all URLs exactly matching existing image sequences in the dataset are skipped.
Image sequences require the createVideo flag to be set to true. Image groups and image sequences both use the key image_groups.
The position of each image within the sequence is specified in its key (objectUrl_{position_number}).
Template: Provides the proper JSON format to import image sequences into Encord.
Examples:
- Data: Imports the image sequences only.
- Custom Metadata: Imports image sequences and custom metadata. This example includes the following custom metadata types: boolean, varchar, datetime, uuid, number.
{
"image_groups": [
{
"title": "<title 1>",
"createVideo": true,
"objectUrl_0": "<object url>"
},
{
"title": "<title 2>",
"createVideo": true,
"objectUrl_0": "<object url>",
"objectUrl_1": "<object url>",
"objectUrl_2": "<object url>",
"clientMetadata": {"optional": "metadata"}
}
],
"skip_duplicate_urls": true
}
DICOM
For detailed information about the JSON file format used for import go here.
- Each dicom_series element can contain one or more DICOM series.
- Each series requires a title and at least one object URL, as shown in the example below.
- If skip_duplicate_urls is set to true, all object URLs exactly matching existing DICOM files in the dataset will be skipped.
.dcm file and does not have to be specific during the upload to Encord.
The following is an example JSON for uploading three DICOM series belonging to a study. Each title and object URL correspond to individual DICOM series.
- The first series contains only a single object URL, as it is composed of a single file.
- The second series contains 3 object URLs, as it is composed of three separate files.
- The third series contains 2 object URLs, as it is composed of two separate files.
For each DICOM upload, an additional DicomSeries file is created. This file represents the series file-set. Only DicomSeries are displayed in the Encord application.
{
"dicom_series": [
{
"title": "Series-1",
"objectUrl_0": "https://encord-bucket.obs.eu-de.otc.t-systems.com/images/study1-series1-file.dcm"
},
{
"title": "Series-2",
"objectUrl_0": "https://encord-bucket.obs.eu-de.otc.t-systems.com/images/study1-series2-file1.dcm",
"objectUrl_1": "https://encord-bucket.obs.eu-de.otc.t-systems.com/images/study1-series2-file2.dcm",
"objectUrl_2": "https://encord-bucket.obs.eu-de.otc.t-systems.com/images/study1-series2-file3.dcm"
},
{
"title": "Series-3",
"objectUrl_0": "https://encord-bucket.obs.eu-de.otc.t-systems.com/images/study1-series3-file1.dcm",
"objectUrl_1": "https://encord-bucket.obs.eu-de.otc.t-systems.com/images/study1-series3-file2.dcm"
}
],
"skip_duplicate_urls": true
}
You can upload multiple file types using a single JSON file. The example below shows 1 image, 2 videos, 2 image sequences, and 1 image group.
{
"images": [
{
"objectUrl": "https://encord-bucket.obs.eu-de.otc.t-systems.com/images/Image1.png"
}
],
"videos": [
{
"objectUrl": "https://encord-bucket.obs.eu-de.otc.t-systems.com/videos/Cooking.mp4"
},
{
"objectUrl": "https://encord-bucket.obs.eu-de.otc.t-systems.com/videos/Oranges.mp4"
}
],
"image_groups": [
{
"title": "apple-samsung-light",
"createVideo": true,
"objectUrl_0": "https://encord-bucket.obs.eu-de.otc.t-systems.com/images/1+(32).jpg",
"objectUrl_1": "https://encord-bucket.obs.eu-de.otc.t-systems.com/images/1+(33).jpg",
"objectUrl_2": "https://encord-bucket.obs.eu-de.otc.t-systems.com/images/1+(34).jpg",
"objectUrl_3": "https://encord-bucket.obs.eu-de.otc.t-systems.com/images/1+(35).jpg"
},
{
"title": "apple-samsung-dark",
"createVideo": true,
"objectUrl_0": "https://encord-bucket.obs.eu-de.otc.t-systems.com/images/2+(32).jpg",
"objectUrl_1": "https://encord-bucket.obs.eu-de.otc.t-systems.com/images/2+(33).jpg",
"objectUrl_2": "https://encord-bucket.obs.eu-de.otc.t-systems.com/images/2+(34).jpg",
"objectUrl_3": "https://encord-bucket.obs.eu-de.otc.t-systems.com/images/2+(35).jpg"
},
{
"title": "apple-ios-light",
"createVideo": false,
"objectUrl_0": "https://encord-bucket.obs.eu-de.otc.t-systems.com/images/3+(32).jpg",
"objectUrl_1": "https://encord-bucket.obs.eu-de.otc.t-systems.com/images/3+(33).jpg"
}
],
"skip_duplicate_urls": true
}
Use a Multi-Region Access Point
When using a Multi-Region Access Point for your AWS S3 buckets, the JSON file differs slightly from the examples provided. Instead of an object’s URL, objects are specified using the ARN of the Multi-Region Access Point followed by the object name. The example below shows how video files from a Multi-Region Access Point would be specified.
{
"videos": [
{
"objectUrl": "Multi-Region-Access-Point-ARN + <object name_1>"
},
{
"objectUrl": "Multi-Region-Access-Point-ARN + <object name_2>",
"title": "my-custom-video-title.mp4",
"clientMetadata": {"optional": "metadata"}
}
],
"skip_duplicate_urls": true
}
{
"videos": [
{
"objectUrl": "https://arn:aws:s3::123123123:accesspoint/frf28frarf9.mrap.s3-accesspoint.amazonaws.com/Videos/2022/video_1.mp4"
},
{
"objectUrl": "https://arn:aws:s3::123123123:accesspoint/frf28frarf9.mrap.s3-accesspoint.amazonaws.com/Videos/2022/video_2.mp4",
"title": "many-cute-cats.mp4",
"clientMetadata": {"optional": "metadata"}
}
],
"skip_duplicate_urls": true
}
Import your Files
Register Cloud Data
- Navigate to the Files section of Index in the Encord platform.
- Click into a Folder.
- Click + Upload files. A dialog appears.
- Click Import from cloud data.
Import Local Data
- Navigate to the Files section of Index in the Encord platform.
- Click into a Folder.
- Click + Upload files. A dialog appears.
- Click one of the following:
  - Upload: Upload images, videos, and audio files.
  - Batch images as: Upload image batches as image groups or image sequences.
  - DICOM/NifTi: Upload DICOM or NifTi series.
- Click Upload after selecting your images or series.
Your files upload into the Folder in Encord.
STEP 2: Create a Benchmark Project
The benchmark Project contains reference labels used to evaluate your annotators’ labels. These gold standard labels should be created by a trusted expert to ensure accurate assessment.
Create a Training Dataset
Create a Dataset containing tasks designed to establish ground truth labels. These files are used to generate ‘gold-standard’ labels against which annotator performance can be evaluated. Give the Dataset a meaningful name.
Create an Ontology
Create an Ontology to label your data. The same Ontology must be used in the Benchmark Project AND the Annotator Training Project.
Create the Benchmark Project
Ensure that you attach ONLY the Training Dataset to the Project.
- Go to Annotate > Projects.
- Click the + New annotation project button to create a new Project.
- Give the Project a meaningful title and description. For example “Benchmark Labels”.
- Click the Attach ontology button and attach the Ontology you created.
- Click the Attach dataset button and attach the Training Dataset you created.
- Click Invite collaborators. Add collaborators to the Project and add them to the relevant Workflow stages. Your annotators should be experts you trust to create gold-standard labels.
- Click Create project to finish creating the Project. You have now created the Project to establish ground-truth labels.
STEP 3: Create Annotator Training Projects
Create a Project where your annotation workforce labels data and is evaluated against benchmark labels.
Create an Annotator Training Workflow Template
Create a Workflow template and give it a meaningful name like “Annotator Training”.
Create the following Workflow template for your Annotator Training Projects. Documentation on how to create new Workflow templates can be found here.
Create Annotator Training Projects
You must create one Annotator Training Project per annotator. This step must be repeated for each annotator.
Ensure that you:
- Attach the Training Dataset you created in Step 2.1 for the Benchmark Project.
- Attach the SAME Ontology you created in Step 2.2 for the Benchmark Project.
- Attach the Annotator Training Workflow Template to the Project.
- Go to Annotate > Projects.
- Click the + New annotation project button to create a new Project.
- Give the Project a meaningful title and description. For example “Annotator Training - Alex” for an annotator named Alex.
- Click the Attach ontology button and attach the Ontology you created. Attach the SAME Ontology you created in Step 2.2 for the Benchmark Project.
- Click the Attach dataset button and attach the training Dataset you created in Step 2.1.
- Click the Load from template button to attach the “Annotator Training” template you created in Step 3.1.
- Click Invite collaborators. Add the annotator you want to train in this Project to the annotation stage.
- Click Create Project to create the Project. You have now created the Project to train the selected annotator.
STEP 4: Annotator Training
Your annotators must now complete all tasks in the Annotator Training Project they are assigned to. Only tasks in the Complete stage are evaluated.
Information on how to label can be found here.
STEP 5: Evaluate Annotators
This example only evaluates Bounding Boxes.
Save and run the following script to evaluate annotator performance. The script must be run once for each Annotator Training Project. It outputs a CSV file called iou_results.csv containing the results. The evaluation metrics used are Intersection over Union (IoU) and Class score.
- IoU (Intersection over Union): Quantifies the overlap between an annotator's label and the ground truth label. It ranges from 0 to 1:
  - 1.0: Indicates a perfect overlap between the annotator's label and the ground truth.
  - 0.0: Indicates no overlap between the annotator's label and the ground truth.
  - Values between 0 and 1: Represent partial overlap, calculated as the area of intersection divided by the area of union. For example, an IoU of 0.6 means that the area shared by the two labels is 60% of their combined area.
- Class Score (0 or 1):
  - 1: The label was created using the correct class.
  - 0: The label was created using the wrong class.
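As a worked illustration of the IoU metric, the following standalone sketch computes IoU for two boxes given as plain tuples. It does not use the Encord SDK; the (top_left_x, top_left_y, width, height) tuple layout and the example coordinates are assumptions chosen for illustration.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (top_left_x, top_left_y, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping rectangle, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


benchmark = (0.0, 0.0, 0.5, 0.5)
annotator = (0.25, 0.0, 0.5, 0.5)  # same size, shifted right by half its width

score = iou(benchmark, annotator)
print(round(score, 3))
```

Here half of the annotator's box lies inside the benchmark box, yet the IoU is one third: the intersection area (0.125) is divided by the union area (0.375), not by the area of either box alone.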
Ensure that you:
- Replace <private_key_path> with the full path to your private SSH key.
- Replace <benchmark-project-id> with the id of your Benchmark Project.
- Replace <training-project-id> with the id of the Training Project you want to evaluate.
from encord import EncordUserClient
from encord.objects.common import Shape
from encord.objects.coordinates import BoundingBoxCoordinates
import pandas as pd
from encord.user_client import EncordUserClient
import os
# Instantiate Encord client by substituting the path to your private key
user_client = EncordUserClient.create_with_ssh_private_key(
ssh_private_key_path="<private_key_path>"
)
training_project_id = "<training-project-id>"
benchmark_project_id = "<benchmark-project-id>"
training_project = user_client.get_project(training_project_id)
benchmark_project = user_client.get_project(benchmark_project_id)
training_label_rows = training_project.list_label_rows_v2(workflow_graph_node_title_eq='Complete')
benchmark_label_rows = benchmark_project.list_label_rows_v2(workflow_graph_node_title_eq='Complete')
# Match by data_hash
benchmark_dict = {lr.data_hash: lr for lr in benchmark_label_rows}
paired_label_rows = [
    (benchmark_dict[lr.data_hash], lr)
    for lr in training_label_rows
    if lr.data_hash in benchmark_dict
]
# Initialise labels
with training_project.create_bundle() as bundle:
    for _, prod_lr in paired_label_rows:
        prod_lr.initialise_labels(bundle=bundle, overwrite=True)
with benchmark_project.create_bundle() as bundle:
    for bm_lr, _ in paired_label_rows:
        bm_lr.initialise_labels(bundle=bundle, overwrite=True)
# IoU calculation
def calculate_iou(bbox1: BoundingBoxCoordinates, bbox2: BoundingBoxCoordinates) -> float:
    x_left = max(bbox1.top_left_x, bbox2.top_left_x)
    y_top = max(bbox1.top_left_y, bbox2.top_left_y)
    x_right = min(bbox1.top_left_x + bbox1.width, bbox2.top_left_x + bbox2.width)
    y_bottom = min(bbox1.top_left_y + bbox1.height, bbox2.top_left_y + bbox2.height)
    intersection = max(0, x_right - x_left) * max(0, y_bottom - y_top)
    union = bbox1.width * bbox1.height + bbox2.width * bbox2.height - intersection
    return intersection / union if union > 0 else 0.0
# Compare labels and extract information
results = []
for bm_lr, prod_lr in paired_label_rows:
    prod_instances = [
        oi for oi in prod_lr.get_object_instances()
        if oi.ontology_item.shape == Shape.BOUNDING_BOX and oi.get_annotation(0)
    ]
    bm_instances = [
        oi for oi in bm_lr.get_object_instances()
        if oi.ontology_item.shape == Shape.BOUNDING_BOX and oi.get_annotation(0)
    ]
    training_data_unit_name = prod_lr.data_title
    training_label_id = prod_lr.label_hash
    for prod_obj in prod_instances:
        best_iou = 0.0
        best_match_hash = None
        prod_bbox = prod_obj.get_annotation(0).coordinates
        training_email = prod_obj.get_annotation(0).created_by
        for bm_obj in bm_instances:
            bm_bbox = bm_obj.get_annotation(0).coordinates
            iou = calculate_iou(prod_bbox, bm_bbox)
            if iou > best_iou:
                best_iou = iou
                best_match_hash = bm_obj.feature_hash
        class_score = 1.0 if best_match_hash == prod_obj.feature_hash and best_match_hash is not None else 0.0
        results.append({
            'training_email': training_email,
            'data_unit_name': training_data_unit_name,
            'label_id': training_label_id,
            'iou_score': best_iou,
            'class_score': class_score
        })
# Output the results to a CSV file
if results:
    df_results = pd.DataFrame(results)
    script_dir = os.path.dirname(os.path.abspath(__file__))
    csv_file_path = os.path.join(script_dir, "iou_results.csv")
    df_results.to_csv(csv_file_path, index=False)
    print(f"Results saved to: {csv_file_path}")
else:
    print("No matching label rows found for comparison.")
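Once iou_results.csv exists, it can be summarised per annotator to guide feedback. The following is a minimal sketch using only the Python standard library. The `summarise` helper and the 0.5 IoU threshold are assumptions chosen for illustration, not part of the Encord SDK, and the sample rows are made-up values that only mirror the column names written by the script above.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def summarise(csv_file, iou_threshold=0.5):
    """Group rows by annotator and compute simple aggregate scores."""
    per_annotator = defaultdict(list)
    for row in csv.DictReader(csv_file):
        per_annotator[row["training_email"]].append(
            (float(row["iou_score"]), float(row["class_score"]))
        )
    summary = {}
    for email, scores in per_annotator.items():
        ious = [s[0] for s in scores]
        summary[email] = {
            "labels": len(scores),
            "mean_iou": mean(ious),
            "mean_class_score": mean(s[1] for s in scores),
            # Labels whose best IoU falls below the (assumed) threshold.
            "below_threshold": sum(1 for v in ious if v < iou_threshold),
        }
    return summary


# Made-up sample rows, used here instead of a real iou_results.csv.
sample = io.StringIO(
    "training_email,data_unit_name,label_id,iou_score,class_score\n"
    "a@example.com,img1.jpg,abc,0.9,1.0\n"
    "a@example.com,img2.jpg,def,0.3,0.0\n"
)
result = summarise(sample)
print(result)
```

To run this against real output, open the file with `open("iou_results.csv", newline="")` and pass the file object to `summarise` in place of `sample`.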