Add Local Data to Datasets
All Datasets are identified using a unique ID called a <dataset_hash>. The <dataset_hash> can be found by clicking a Dataset in the Encord platform.
Upload private cloud data
All types of data (videos, images, image groups, image sequences, and DICOM) from a private cloud are added to a Dataset in the exact same way.
Use the script below to upload your private cloud data to a specified Dataset.
- Replace <dataset_hash> with the ID of the Dataset you want to upload your data to.
- Replace <integration_title> with the title of the integration you want to use. You can see all available integrations in the Encord platform, or using the SDK.
- Replace path/to/json/file.json with the path to your JSON file.
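The integration lookup by title can be sketched as follows. This is a minimal illustration, not the SDK's own code: it assumes each integration object exposes `id` and `title` attributes, and both `FakeIntegration` and `find_integration_id` are hypothetical names introduced here. With the SDK, the list would be expected to come from the user client (for example `get_cloud_integrations()`, which should be treated as an assumption and checked against your SDK version).

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FakeIntegration:
    """Hypothetical stand-in for an integration object exposing `id` and `title`."""
    id: str
    title: str


def find_integration_id(integrations: Iterable, title: str) -> str:
    """Return the id of the single integration whose title matches exactly."""
    matches = [item.id for item in integrations if item.title == title]
    if not matches:
        raise LookupError(f"No integration titled {title!r}")
    if len(matches) > 1:
        raise LookupError(f"{len(matches)} integrations share the title {title!r}")
    return matches[0]


# Hand-built list for illustration; with the SDK it would come from the user client.
integrations = [
    FakeIntegration(id="id-aws", title="AWS bucket"),
    FakeIntegration(id="id-gcp", title="GCP bucket"),
]
print(find_integration_id(integrations, "GCP bucket"))
```

Failing loudly on a missing or duplicated title avoids silently uploading through the wrong integration.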
"Upload is still in progress, try again later!"
If the script prints the message above, check the upload status at a later time.
Check data upload
If the code returns "Upload is still in progress, try again later!", run the following code to query the Encord server again. Replace upload_job_id with the ID output by the previous code. In the example above, upload_job_id=c4026edb-4fw2-40a0-8f05-a1af7f465727.
# Import dependencies
from encord import EncordUserClient
from encord.orm.dataset import LongPollingStatus

# Instantiate user client
user_client = EncordUserClient.create_with_ssh_private_key(ssh_private_key_path="/Users/encord/.ssh/new-key-db-private-key.txt")

# Specify the Dataset the data is being uploaded to. Replace <dataset_hash> with the hash of your Dataset
dataset = user_client.get_dataset("<dataset_hash>")

# Replace <upload_job_id> with the ID output when the upload was started
upload_job_id = "<upload_job_id>"

# Check upload status
res = dataset.add_private_data_to_dataset_get_result(upload_job_id, timeout_seconds=5)
print(f"Execution result: {res}")

if res.status == LongPollingStatus.PENDING:
    print("Upload is still in progress, try again later!")
elif res.status == LongPollingStatus.DONE:
    print("Upload completed without errors")
else:
    print(f"Errors: {res.errors}")
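Instead of re-running the status check by hand, the check can be wrapped in a bounded retry loop. The sketch below is an assumption-laden illustration rather than part of the SDK: `wait_for_upload` is a hypothetical helper, the plain strings "PENDING" and "DONE" stand in for the LongPollingStatus values, and `check_status` would in practice be a small function that calls add_private_data_to_dataset_get_result() and returns the status.

```python
import time
from typing import Callable


def wait_for_upload(check_status: Callable[[], str],
                    max_attempts: int = 10,
                    delay_seconds: float = 5.0,
                    sleep: Callable[[float], None] = time.sleep) -> str:
    """Call check_status until it stops returning "PENDING" or attempts run out."""
    status = "PENDING"
    for attempt in range(1, max_attempts + 1):
        status = check_status()
        if status != "PENDING":
            return status
        # Do not sleep after the final attempt.
        if attempt < max_attempts:
            sleep(delay_seconds)
    return status


# Simulated job that reports pending twice and then done.
_responses = iter(["PENDING", "PENDING", "DONE"])
final_status = wait_for_upload(lambda: next(_responses), delay_seconds=0.0)
print(final_status)
```

Capping the number of attempts keeps a stuck upload from blocking a script indefinitely; the caller can decide what to do when the final status is still pending.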
Omitting the timeout_seconds argument from the add_private_data_to_dataset_get_result() method performs status checks until the upload has finished.
Local data
Uploading videos
Use the upload_video() method to upload a video to a Dataset specified using the <dataset_hash>.
# Import dependencies
from encord import EncordUserClient

# Authenticate with Encord. Replace <private_key_path> with the path to your private key
user_client = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path="<private_key_path>"
)

# Specify the Dataset you want to upload your video(s) to. Replace <dataset_hash> with the hash of your Dataset
dataset = user_client.get_dataset(
    "<dataset_hash>"
)

# Upload the video to the Dataset by specifying the file path to the video
dataset.upload_video(
    "path/to/your/video.mp4"
)
Uploading single images
Use the upload_image() method to upload a single image to a Dataset specified using the <dataset_hash>.
# Import dependencies
from encord import EncordUserClient

# Authenticate with Encord. Replace <private_key_path> with the path to your private key
user_client = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path="<private_key_path>"
)

# Specify the Dataset you want to upload your images to. Replace <dataset_hash> with the hash of your Dataset
dataset = user_client.get_dataset(
    "<dataset_hash>"
)

# Upload the image to the Dataset by specifying the file path to the image
dataset.upload_image(
    "path/to/your/image.jpeg"
)
Uploading image groups & image sequences
Use the create_image_group() method to combine images into image groups or image sequences and add them to a Dataset.
Image groups
Image groups are created using the create_image_group() method with create_video=False as an argument. Specify the file paths of each image you want to include in the image group in the script below.
Each image in an image group is assigned a data_sequence number, which is based on the order of the files listed in the argument to create_image_group(). If the ordering is important to you, make sure that your filenames are listed in the correct order.
# Import dependencies
from encord import EncordUserClient

# Authenticate with Encord. Replace <private_key_path> with the path to your private key
user_client = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path="<private_key_path>"
)

# Specify the Dataset you want to upload your image group to. Replace <dataset_hash> with the hash of your Dataset
dataset = user_client.get_dataset(
    "<dataset_hash>"
)

# Create the image group. Include the paths of all images that are to be included in the image group.
# The create_video flag must be set to False
dataset.create_image_group(
    [
        "path/to/your/img1.jpeg",
        "path/to/your/img2.jpeg",
    ],
    create_video=False
)
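Because the order of the file list determines the order of the images, it can help to build that list programmatically rather than by hand. The following is a minimal sketch under stated assumptions: `natural_key` and `ordered_image_paths` are hypothetical helpers, and the natural (numeric-aware) sort is one reasonable ordering choice, not something the SDK prescribes. The resulting list could then be passed to create_image_group().

```python
import re
import tempfile
from pathlib import Path
from typing import List


def natural_key(path: Path) -> list:
    """Split a filename into text and integer chunks so img2 sorts before img10."""
    return [int(chunk) if chunk.isdigit() else chunk.lower()
            for chunk in re.split(r"(\d+)", path.name)]


def ordered_image_paths(directory: str, pattern: str = "*.jpeg") -> List[str]:
    """Return the matching file paths in natural filename order."""
    return [str(p) for p in sorted(Path(directory).glob(pattern), key=natural_key)]


# Demonstrate on empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("img10.jpeg", "img2.jpeg", "img1.jpeg"):
        (Path(tmp) / name).touch()
    ordered_names = [Path(p).name for p in ordered_image_paths(tmp)]

print(ordered_names)
```

A plain alphabetical sort would place img10.jpeg before img2.jpeg, which is why the key compares the numeric parts as integers.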
Image sequences
Image sequences are created using the create_image_group() method. Image sequences can only be composed of images that have the same dimensions. Images with different dimensions are made into separate image sequences.
create_video is set to True by default and can therefore be omitted when creating an image sequence.
# Import dependencies
from encord import EncordUserClient

# Authenticate with Encord. Replace <private_key_path> with the path to your private key
user_client = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path="<private_key_path>"
)

# Specify the Dataset you want to upload your image sequence to. Replace <dataset_hash> with the hash of your Dataset
dataset = user_client.get_dataset(
    "<dataset_hash>"
)

# Create the image sequence. Include the paths of all images that are to be included in the image sequence.
# create_video defaults to True; it is passed explicitly here for clarity
dataset.create_image_group(
    [
        "path/to/your/img1.jpeg",
        "path/to/your/img2.jpeg",
    ],
    create_video=True
)
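To anticipate how many image sequences a batch will produce, the images can be grouped by their dimensions before uploading. The sketch below only illustrates the same-dimensions rule and is not the platform's implementation: `group_by_dimensions` is a hypothetical helper, img3.jpeg is an invented extra file, and the dimensions are supplied by hand (in practice they would be read with an imaging library).

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Size = Tuple[int, int]  # (width, height) in pixels


def group_by_dimensions(images: Iterable[Tuple[str, Size]]) -> Dict[Size, List[str]]:
    """Group image paths by size, preserving the input order within each group."""
    groups: Dict[Size, List[str]] = defaultdict(list)
    for path, size in images:
        groups[size].append(path)
    return dict(groups)


# Hand-written dimensions for illustration only.
images = [
    ("path/to/your/img1.jpeg", (1920, 1080)),
    ("path/to/your/img2.jpeg", (1280, 720)),
    ("path/to/your/img3.jpeg", (1920, 1080)),
]
for size, paths in group_by_dimensions(images).items():
    print(size, paths)
```

Each group corresponds to one expected image sequence, so a batch with two distinct sizes should be expected to yield two sequences.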
If img1.jpeg and img2.jpeg are of shape [1920, 1080] and [1280, 720], respectively, each ends up in its own image sequence.
Uploading DICOM series
In the following script, replace path/to/your/dicom-img1.dcm and the other example file paths with the paths to the files you want to include in your DICOM series.
# Import dependencies
from encord import EncordUserClient

# Authenticate with Encord. Replace <private_key_path> with the path to your private key
user_client = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path="<private_key_path>"
)

# Specify the Dataset you want to upload your DICOM files to. Replace <dataset_hash> with the hash of your Dataset
dataset = user_client.get_dataset(
    "<dataset_hash>"
)

# Add a DICOM series to the Dataset by specifying the file path to all files to include.
dataset.create_dicom_series(
    [
        "path/to/your/dicom-img1.dcm"
    ]
)
Reading and updating data
To inspect the data within a Dataset, use the data_rows property of the Dataset class. Because data_rows is a property, it is accessed without parentheses. It returns a list of DataRows.
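A quick way to inspect a Dataset is to summarise its rows by data type. The sketch below is an illustration under assumptions: it presumes each row exposes `title` and `data_type` attributes, uses a hypothetical `FakeDataRow` stand-in with plain strings for the data type, and `count_by_type` is a helper introduced here. With the SDK, the rows returned by the Dataset would be passed in instead.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FakeDataRow:
    """Hypothetical stand-in for a data row exposing `title` and `data_type`."""
    title: str
    data_type: str


def count_by_type(rows: Iterable) -> Counter:
    """Count rows per data type, using the string form of each row's data_type."""
    return Counter(str(row.data_type) for row in rows)


# Hand-built rows for illustration; with the SDK these would come from the Dataset.
rows = [
    FakeDataRow(title="video.mp4", data_type="video"),
    FakeDataRow(title="image.jpeg", data_type="image"),
    FakeDataRow(title="clip.mp4", data_type="video"),
]
counts = count_by_type(rows)
for data_type, total in counts.items():
    print(f"{data_type}: {total}")
```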