Cloud-synced folders stay in sync with the connected cloud storage automatically, updating regularly so that all new and existing files are available in Encord without manual intervention. To use Cloud-synced folders, you must have at least one integration set up.
File Support and Limits
- Cloud-synced folders support images, videos, PDFs, text, HTML, and audio files.
- Image groups, image sequences, and DICOM series are not currently supported by Cloud-synced folders.
- A single Cloud-synced folder can contain a maximum of 10 million files.
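As a rough local pre-check, the file categories listed above can be approximated with Python's standard `mimetypes` module. This is only a sketch: the helper name `looks_supported` and the MIME groupings are assumptions made for illustration, and the formats Encord actually accepts are decided by Encord, not by this check.

```python
import mimetypes

# Assumption: these MIME groupings approximate the categories listed above
# (images, videos, audio, PDFs, plain text, HTML). They are not an official list.
_ALLOWED_TOP_LEVEL = {"image", "video", "audio"}
_ALLOWED_EXACT = {"application/pdf", "text/plain", "text/html"}


def looks_supported(file_name: str) -> bool:
    """Guess from the file extension whether a file falls into a listed category."""
    mime_type, _ = mimetypes.guess_type(file_name)
    if mime_type is None:
        # Unknown or missing extension: treat as unsupported in this sketch.
        return False
    top_level = mime_type.split("/", 1)[0]
    return top_level in _ALLOWED_TOP_LEVEL or mime_type in _ALLOWED_EXACT
```

A check like this only inspects file names, so it can flag obvious mismatches in a bucket listing before a sync but cannot confirm that a file will be ingested.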
Create a Cloud-synced Folder
from uuid import UUID
from encord import EncordUserClient
from encord.orm.storage import CloudSyncedFolderParams
# User input
SSH_PATH = "/Users/chris-encord/ssh-private-key.txt" # Specify the file path to your access key
CLOUD_SYNCED_FOLDER_NAME = "SDK Cloud-Synced Folder" # Specify a meaningful name for your Cloud-synced Folder
CLOUD_SYNCED_FOLDER_DESCRIPTION = "A folder to store my files" # Specify a meaningful description for your Cloud-synced Folder
INTEGRATION_UUID = "3b6299c3-f8c8-4755-ae26-d9144b215920" # Specify the unique id for your integration
REMOTE_URL = "gs://my-gcp-bucket/" # Specify the storage/file path to your cloud storage
# Authenticate with Encord using the path to your private key
user_client: EncordUserClient = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path=SSH_PATH,
    domain="https://api.encord.com",
)
# Create cloud synced folder params
cloud_synced_folder_params = CloudSyncedFolderParams(
    integration_uuid=UUID(INTEGRATION_UUID),
    remote_url=REMOTE_URL,
)
# Create the storage folder
folder_name = CLOUD_SYNCED_FOLDER_NAME
folder_description = CLOUD_SYNCED_FOLDER_DESCRIPTION
folder_metadata = {"my": "folder_metadata"}
storage_folder = user_client.create_storage_folder(
    name=folder_name,
    description=folder_description,
    client_metadata=folder_metadata,
    cloud_synced_folder_params=cloud_synced_folder_params,
)
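Because the integration UUID and remote URL are typed in by hand, it can help to validate them locally before calling the API. The following is a minimal sketch using only the Python standard library; the helper name `validate_folder_inputs` is hypothetical, and the check only confirms that the URL has a scheme and a bucket name, not that Encord or the integration accepts it.

```python
from typing import Tuple
from urllib.parse import urlparse
from uuid import UUID


def validate_folder_inputs(integration_uuid: str, remote_url: str) -> Tuple[UUID, str]:
    """Check the user inputs locally; raises ValueError if either looks malformed."""
    # UUID() raises ValueError when the string is not a valid UUID.
    parsed_uuid = UUID(integration_uuid)

    # A URL such as "gs://my-gcp-bucket/" parses into scheme "gs" and netloc "my-gcp-bucket".
    parsed_url = urlparse(remote_url)
    if not parsed_url.scheme or not parsed_url.netloc:
        raise ValueError(
            f"Expected a URL with a scheme and bucket, like 'gs://my-gcp-bucket/', got {remote_url!r}"
        )
    return parsed_uuid, remote_url
```

The returned `UUID` could then be passed as `integration_uuid` when building `CloudSyncedFolderParams`.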
Sync Cloud-synced Folder with Cloud Storage
The following code syncs a Cloud-synced Folder with the cloud storage bucket.
The timeout_seconds value passed to sync_private_data_with_cloud_synced_folder_get_result can be adjusted to your needs.
from uuid import UUID
from encord import EncordUserClient
from encord.orm.storage import SyncPrivateDataWithCloudSyncedFolderStatus
# User input
SSH_PATH = "/Users/chris-encord/ssh-private-key.txt" # Specify the file path to your access key
CLOUD_SYNCED_FOLDER_UUID = UUID("7270fb4a-fc8a-4336-b8dd-5b548d27889d")
# Authenticate with Encord
user_client: EncordUserClient = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path=SSH_PATH,
    domain="https://api.encord.com",
)
# Get folder by UUID
storage_folder = user_client.get_storage_folder(CLOUD_SYNCED_FOLDER_UUID)
print(f"Using cloud-synced folder uuid={storage_folder.uuid}")
# Start sync job
sync_job_uuid = storage_folder.sync_private_data_with_cloud_synced_folder_start()
print(f"Started sync job: {sync_job_uuid}")
# Poll for result
result = storage_folder.sync_private_data_with_cloud_synced_folder_get_result(
    sync_job_uuid,
    timeout_seconds=300,  # adjust as needed
)
print(f"Sync job finished with status: {result.status}")
# Handle result
if result.status == SyncPrivateDataWithCloudSyncedFolderStatus.DONE:
    print("Sync completed (server finished the job).")
    any_errors = (
        result.scan_pages_processing_error > 0
        or result.upload_jobs_error > 0
        or result.upload_jobs_units_error > 0
    )
    print("Progress summary:")
    print(
        f" Bucket listing pages: "
        f"pending={result.scan_pages_processing_pending}, "
        f"done={result.scan_pages_processing_done}, "
        f"error={result.scan_pages_processing_error}, "
        f"cancelled={result.scan_pages_processing_cancelled}"
    )
    print(
        f" Upload jobs: "
        f"pending={result.upload_jobs_pending}, "
        f"done={result.upload_jobs_done}, "
        f"error={result.upload_jobs_error}"
    )
    print(
        f" File units: "
        f"pending={result.upload_jobs_units_pending}, "
        f"done={result.upload_jobs_units_done}, "
        f"error={result.upload_jobs_units_error}, "
        f"cancelled={result.upload_jobs_units_cancelled}"
    )
    if any_errors:
        print("Sync finished, but some parts failed. Inspect the *_error counters above.")
    else:
        print("Sync finished successfully with no reported errors.")
elif result.status == SyncPrivateDataWithCloudSyncedFolderStatus.PENDING:
    print("Sync is still in progress. Try polling again later.")
elif result.status == SyncPrivateDataWithCloudSyncedFolderStatus.ERROR:
    print("Sync failed (critical error).")
elif result.status == SyncPrivateDataWithCloudSyncedFolderStatus.CANCELLED:
    print("Sync was cancelled.")
else:
    print(f"Unexpected status: {result.status!r}")
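When a job is still pending after the timeout, one option is to call the get-result method again a few times rather than stopping. The helper below is a minimal sketch of that retry loop, not part of the Encord SDK: `poll_until_finished`, `get_result`, `is_pending`, `max_attempts`, and `wait_seconds` are hypothetical names, and the callables are passed in so the loop itself makes no assumptions about the SDK.

```python
import time
from typing import Any, Callable


def poll_until_finished(
    get_result: Callable[[], Any],
    is_pending: Callable[[Any], bool],
    max_attempts: int = 5,
    wait_seconds: float = 30.0,
) -> Any:
    """Call get_result() until is_pending(result) is False or attempts run out.

    Returns the last result obtained, which may still be pending.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    result = None
    for attempt in range(1, max_attempts + 1):
        result = get_result()
        if not is_pending(result):
            return result
        # Wait before the next attempt, but not after the final one.
        if attempt < max_attempts:
            time.sleep(wait_seconds)
    return result
```

With the objects from the sync example, this could be called with `get_result=lambda: storage_folder.sync_private_data_with_cloud_synced_folder_get_result(sync_job_uuid, timeout_seconds=300)` and `is_pending=lambda r: r.status == SyncPrivateDataWithCloudSyncedFolderStatus.PENDING`, after which the returned result is handled as shown above.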